JasonZhu
JasonZhu wrote ...

2014/3/8

How to improve a String-related program

JasonZhu JasonZhu

2014/3/8

#
/**
 * Searches the data file and prints out the stats of the file.
 */
public static void wordStats() throws Exception
{
    File file = new File("words.txt");
    Scanner in = new Scanner(new FileInputStream(file));
    int words = 1;
    String longestWord = new String();
    String shortestWord = new String();
    String line = in.nextLine();
    line = line.trim();
    char ch[]= new char[line.length()];
    try{
        for(int i=0;i<line.length();i++){
            ch[i]= line.charAt(i);
            if( ((i>0)&&(ch[i]!=' ')&&(ch[i-1]==' ')) || ((ch[0]!=' ')&&(i==0)) ){
                words++;
            }
        }
        System.out.println("There are "+words+" words.");
        System.out.println("The longest word is "+longestWord+".");
        System.out.println("The shortest word is "+shortestWord+".");
    }catch(Exception e)
    {
        System.out.println(e.toString());
    }
}
Currently, the word counter is functional. What I have left to do is print out every word in the document in 5 separate columns. Example: mat map mad mid mom mop mud mum met may man men mob mug mix moo mow. I also need to return the longest and shortest words. What would be the best way to go about this given my current approach? Thanks in advance for all the help, guys!
danpost danpost

2014/3/9

#
I still do not think that you were very clear as to the format of the file. You indicated (though I am not too sure on this) that each line in the file has one word. But from the code you are giving, there could be more than one word in a line. Also, would there be any punctuation within the lines of the file?
JasonZhu JasonZhu

2014/3/9

#
Sorry for the confusion. This is a program that prints out the statistics of text from an input file. It differs from the word filter in that the text file has all the words on one single line. The example was just to emphasize the output layout. Basically, the program will read each word in the text file and reprint them in 5 separate columns as shown above. It then prints out a word count, the longest word found, and the shortest word found. I got the word count working. I'm only having trouble with the reprinting in 5 columns and with printing out the longest and shortest words. Yes, there is punctuation and there are spaces. In essence, I'm coding a word counter like the one in Word, but I reprint the words from the file in 5 separate columns, ignoring punctuation, and print out the longest and shortest words found.
JasonZhu JasonZhu

2014/3/9

#
I also just realized that my example's formatting was off. I'll provide a new example:
Input: Hi and welcome to a wonderful event hosted by yours truly, Bob Jones.
Output: Hi \t and \t welcome \t to \t a \t wonderful \t event \t hosted \t by \t yours \t truly \t Bob \t Jones \t
13 words.
Longest word: wonderful
Shortest word: a
danpost danpost

2014/3/9

#
I think I would use a 'do' loop which exits when the index counter reaches the length of the line. In it, a while loop to skip any leading non-alphabetic characters; reset the word field to an empty string and use another while loop to add the alphabetic characters to the 'word' field. Then check to make sure that there is a word (that the 'word' field is not still an empty string) and, if there is a word, do the longest and shortest word checks/changes, print the word and increment the word counter which is immediately checked for divisibility by 5 to do a 'println'. After the 'do' loop, line feed a couple times and print the word count and the longest and shortest words found. I would not create the char array like you did above, but use 'line.toUpperCase().charAt(index)' to get the character from the line and then compare that with 'A' and 'Z' to determine if it is alpha or not.
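The loop described above could be sketched roughly as follows. This is only one reading of the description, not code from the thread: the class name `WordStatsSketch` and the helpers `isAlpha` and `report` are made up for illustration, the text is assumed to be plain ASCII, and the report is collected in a `StringBuilder` instead of being printed directly so that it is easier to check. Note that under this rule digits such as "10" are skipped along with punctuation.

```java
class WordStatsSketch {
    // Hypothetical helper: treats only A-Z and a-z as word characters (assumes ASCII text).
    static boolean isAlpha(char c) {
        char upper = Character.toUpperCase(c);
        return upper >= 'A' && upper <= 'Z';
    }

    // Builds the word columns and the statistics for one line of text.
    static String report(String line) {
        StringBuilder out = new StringBuilder();
        String longest = "";
        String shortest = "";
        int words = 0;
        int index = 0;
        do {
            // Skip any leading non-alphabetic characters.
            while (index < line.length() && !isAlpha(line.charAt(index))) {
                index++;
            }
            // Reset the word field, then collect alphabetic characters into it.
            StringBuilder word = new StringBuilder();
            while (index < line.length() && isAlpha(line.charAt(index))) {
                word.append(line.charAt(index));
                index++;
            }
            // Only act if a word was actually found.
            if (word.length() > 0) {
                String found = word.toString();
                if (found.length() > longest.length()) {
                    longest = found;
                }
                if (shortest.isEmpty() || found.length() < shortest.length()) {
                    shortest = found;
                }
                out.append(found).append('\t');
                words++;
                if (words % 5 == 0) {
                    out.append('\n'); // start a new row after every fifth word
                }
            }
        } while (index < line.length());
        out.append("\n\n");
        out.append("There are ").append(words).append(" words.\n");
        out.append("The longest word is ").append(longest).append(".\n");
        out.append("The shortest word is ").append(shortest).append(".\n");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report("Hi and welcome to a wonderful event hosted by yours truly, Bob Jones."));
    }
}
```

Both inner `while` loops test `index < line.length()` first, so the end of the line is handled in every place the index can advance.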
JasonZhu JasonZhu

2014/3/9

#
/**
 * Searches the data file and prints out the stats of the file.
 */
public static void wordStats() throws Exception
{
    File file = new File("words.txt");
    Scanner in = new Scanner(new FileInputStream(file));
    int words = 1;
    String current = "";
    String longestWord = "";
    String shortestWord = "";     
    String line = in.nextLine();
    line = line.trim(); 
    char ch[]= new char[line.length()];       
    try{
        for(int i=0;i<line.length();i++){
            ch[i]= line.charAt(i);
            if( ((i>0)&&(ch[i]!=' ')&&(ch[i-1]==' ')) || ((ch[0]!=' ')&&(i==0)) ){
                words++;
            }
        }
        while(line.hasNext()){
            current = line.next();
            if(current.length()>longestWord.length()){
                longestWord = current;
            }
            if(current.length<shortestWord.length()){
                shortestWord = current;
            }
        }
        System.out.println("There are "+words+" words.");
        System.out.println("The longest word is "+longestWord+".");
        System.out.println("The shortest word is "+shortestWord+".");
    }catch(Exception e)
    {
        System.out.println(e.toString());
    }
}
I'm having trouble making this work. This is the code I currently have, but I am not very familiar with the char setup. Any suggestions to improve it? Thanks in advance!
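For readers following along, three things stop the longest/shortest loop in the code above from working: `hasNext()` and `next()` are `Scanner` methods and cannot be called on the String `line`, `current.length` is missing its parentheses, and `shortestWord` starts as an empty string, so no word can ever be shorter than it. A minimal sketch of one possible repair is below, assuming a second `Scanner` is wrapped around the line. The class and method names are hypothetical, and because `Scanner` splits on whitespace only, punctuation stays attached to the words.

```java
import java.util.Scanner;

class LongestShortestSketch {
    // Returns { longest, shortest } for the whitespace-separated words in the line.
    static String[] longestAndShortest(String line) {
        String longest = "";
        String shortest = null; // null means "no word seen yet"
        Scanner words = new Scanner(line); // a Scanner over the String, not over the file
        while (words.hasNext()) {
            String current = words.next();
            if (current.length() > longest.length()) {
                longest = current;
            }
            if (shortest == null || current.length() < shortest.length()) {
                shortest = current;
            }
        }
        words.close();
        return new String[] { longest, shortest == null ? "" : shortest };
    }

    public static void main(String[] args) {
        String[] result = longestAndShortest("mat map moon a");
        System.out.println("The longest word is " + result[0] + ".");
        System.out.println("The shortest word is " + result[1] + ".");
    }
}
```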
danpost danpost

2014/3/9

#
It kind of looks like you are picking and choosing what to put where and what to use where. There really does not seem to be much thought in putting the code together. First, you should make an outline of what you need to do.

What you are trying to do (once you have a line to parse):
(1) break the line up into words - for each word found
    (a) print word
    (b) compare/set to longest
    (c) compare/set to shortest
    (d) increment word count
(2) print word count
(3) print longest word
(4) print shortest word

Everything in the above is pretty much straightforward except for breaking the line up into words. So, we now detail how to accomplish that.

Breaking line into words:
(1) iterate through the line character by character
    (a) while the character is not an alpha character, go to next character
    (b) clear word field
    (c) while the character is an alpha character, add character to word and go to next character

The only thing missing in the detailing is checking for the end of the line, which will need to be done during both 'while' parts, (a) and (c), as well as in the main iterating part, (1).
JasonZhu JasonZhu

2014/3/9

#
Thank you for helping me break this problem into parts. I see the importance of planning now, but understanding which methods to use when "breaking the line up into words" is proving really difficult for me, since I just started learning the String class and I need to keep the capitalization of the words the way it is. So would I need 2 String alphabet fields?
private static String lowercaseAlphabet = "abcdefghijklmnopqrstuvwxyz";
private static String uppercaseAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
danpost danpost

2014/3/9

#
You should not need either alphabet string. You can use a comparison like line.toUpperCase().charAt(index) >= 'A' && line.toUpperCase().charAt(index) <= 'Z' to check for alpha characters; and line.toUpperCase().charAt(index) < 'A' || line.toUpperCase().charAt(index) > 'Z' to check for non-alpha characters. These can be used without changing the actual character array of the 'line' string.
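A small sketch of why this keeps the original capitalization: `toUpperCase()` returns a new String and leaves `line` untouched, so the upper-case copy can be used for the test while the characters are still read from `line`. The class name `AlphaCheckSketch` and the helper `isAlphaAt` are hypothetical, and the text is assumed to be plain English.

```java
class AlphaCheckSketch {
    // True when the character at index is a letter, judged on an upper-case copy of the line.
    static boolean isAlphaAt(String line, int index) {
        char c = line.toUpperCase().charAt(index);
        return c >= 'A' && c <= 'Z';
    }

    public static void main(String[] args) {
        String line = "Bob Jones.";
        System.out.println(isAlphaAt(line, 0)); // 'B' is a letter
        System.out.println(isAlphaAt(line, 3)); // the space is not
        System.out.println(line);               // still "Bob Jones.", case unchanged
    }
}
```

Calling `toUpperCase()` on the whole line for every character is wasteful for long input; storing the upper-case copy once, or comparing a single character, would avoid the repeated work.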
JasonZhu JasonZhu

2014/3/9

#
You can use the greater-than and less-than signs to check for alpha characters?
JasonZhu JasonZhu

2014/3/9

#
Do you think it would help to use the tokenizer?
danpost danpost

2014/3/9

#
To compare characters ... yes. Each character has a numeric value that is used for comparison. Please refer to the Primitive Data Types page of the Java Tutorials on Learning the Basics.
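As a quick illustration of those numeric values, the sketch below prints the codes of a few letters. The class name `CharCompareSketch` and its helpers are made up for this example. Because the upper-case letters (65 to 90) and the lower-case letters (97 to 122) sit in two separate ranges, converting to upper case first lets a single 'A' to 'Z' range test cover both.

```java
class CharCompareSketch {
    // A char widens to int automatically, giving its numeric code.
    static int codeOf(char c) {
        return c;
    }

    // Range test using the same operators that work on numbers.
    static boolean inRange(char c, char low, char high) {
        return c >= low && c <= high;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));            // 65
        System.out.println(codeOf('Z'));            // 90
        System.out.println(codeOf('a'));            // 97
        System.out.println(codeOf('z'));            // 122
        System.out.println(inRange('m', 'a', 'z')); // true
        System.out.println(inRange('m', 'A', 'Z')); // false
    }
}
```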
danpost danpost

2014/3/9

#
JasonZhu wrote...
Do you think it would help to use the tokenizer?
Not necessary.
JasonZhu JasonZhu

2014/3/9

#
I thought the tokenizer was easier to understand, so I decided to put it to use. Figured there's no harm in learning something new. So my output is now:

Welcome to Fort Richmond Collegiate a Grade 10 to Grade 12 high school Fort Richmond Collegiate is committed to the pursuit of excellence and providing opportunities in a safe and enriched educational community which encourages life-long learning and social responsibility Our reputation for academic excellence is due to the commitment of students teachers and parents the support of the community and programs which challenge students to maximize their potential for growth

There are 72 words.
The longest word is responsibility.
The shortest word is a.

My last question now is: Is there a way to make the columns look neater?
JasonZhu JasonZhu

2014/3/9

#
Oh, I forgot to say that I used "\t" to create the indents.
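One possible way to get neater columns, sketched here without knowing what the poster's tokenizer code looks like, is to pad every word to a fixed width with `String.format` instead of relying on "\t". Tab stops only line up when the words are shorter than the tab width, which is why long words push the later columns out of place. The class name `ColumnSketch`, the method `toColumns`, and the delimiter list are assumptions for illustration.

```java
import java.util.StringTokenizer;

class ColumnSketch {
    // Lays the words of a line out in fixed-width, left-aligned columns.
    static String toColumns(String line, int columns, int width) {
        // Assumed delimiter set: spaces, tabs, and some common punctuation.
        StringTokenizer tokens = new StringTokenizer(line, " \t.,;:!?");
        StringBuilder out = new StringBuilder();
        int count = 0;
        while (tokens.hasMoreTokens()) {
            // "%-16s" (for width 16) left-aligns the word and pads it with spaces.
            out.append(String.format("%-" + width + "s", tokens.nextToken()));
            count++;
            if (count % columns == 0) {
                out.append('\n'); // end the row once it is full
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "Welcome to Fort Richmond Collegiate, a Grade 10 to Grade 12 high school.";
        System.out.print(toColumns(line, 5, 16));
        System.out.println();
    }
}
```

The width should be at least one more than the longest word, since `String.format` pads short words but does not truncate long ones. For the output above, "responsibility" has 14 letters, so a width of 16 leaves a gap after it.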
There are more replies on the next page.