strtok() swallows every 'f'
-
Hello,
I have an esoteric problem with strtok(). I take a char** that holds several lines of a text (that is, a pointer to several char*, each holding one line of that text), split the whole thing into individual words, and then check whether any of the words matches a "pattern". The point is strtok(); this was only meant to be a demo. The code compiles, runs and basically works, except that every 'f' and 'F' disappears along the way. Is that normal? Why 'f' of all letters?
I compile the program on Cygwin (gcc, -g -Wall -std=c99) and run it under Cygwin as well, working from a USB stick (could the stick be defective somehow?). I also wanted to compile it on a Debian machine. There, however, I get a message that something is wrong with line 22 (the line number refers to the listing attached below, where I have put the error message into the comment). That machine is configured for UTF-8, while Cygwin is AFAIR still ANSI. Is that the cause? I can nevertheless open the code in Emacs on the Debian machine without any trouble. Strange.
Error on Debian:
user@machine:/mnt/usbstick/programming/TODO_Strtok$ make
gcc -c -O -g -Wall -std=c99 Strtok.c
Strtok.c:22: error: expected identifier or ‘(’ before ‘__extension__’
Strtok.c:22: error: expected identifier or ‘(’ before ‘)’ token
make: *** [Strtok.o] Error 1

So:
1. Why does strtok() swallow every 'f'? See the output, e.g. the 3rd line (index 2): "fast" becomes "ast"!
2. Why can I compile the code on Cygwin but not on Linux?
3. Could someone perhaps compile the code on their own machine and tell me whether the 'f's disappear there too? Thanks a lot.

// Strtok.c
/*
  Separate a string by tokens and search for a pattern contained in that string.

  user@machine:/mnt/usbstick/programming/TODO_Strtok$ make
  gcc -c -O -g -Wall -std=c99 Strtok.c
  Strtok.c:22: error: expected identifier or ‘(’ before ‘__extension__’
  Strtok.c:22: error: expected identifier or ‘(’ before ‘)’ token
  make: *** [Strtok.o] Error 1
//*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TEXT_SIZE 4
#define PATTERN "Jill"

extern char* strcpy(char*, const char*);
extern size_t strlen(const char*);
extern char* strtok(char*, const char*);
extern int strcmp(const char*, const char*);
extern char* strcat(char*, const char*);

int check_line_for_pattern( char* line, const unsigned long int LINE_SIZE, char* pattern, const unsigned long int PATTERN_SIZE);

int main(int argc, char** argv)
{
  // init a char** with some text
  printf("Strtok - separate a string into tokens\n\ninit a char** with some text:\n");
  char** text;
  if( (text = malloc(sizeof(char*)*4)) == NULL) return -1;

  if( (text[0] = malloc(sizeof(char) * (strlen("Jack and Jill went up the hill to fetch a pail of water") + 1))) == NULL) return -1;
  strcpy(text[0], "Jack and Jill went up the hill to fetch a pail of water");

  if( (text[1] = malloc(sizeof(char) * (strlen("Jack fell down and broke his crown and Jill came tumbling after.") + 1))) == NULL) return -1;
  strcpy(text[1], "Jack fell down and broke his crown and Jill came tumbling after.");

  if( (text[2] = malloc(sizeof(char) * (strlen("Up got Jack, and home did trot as fast as he could caper") + 1))) == NULL) return -1;
  strcpy(text[2], "Up got Jack, and home did trot as fast as he could caper");

  if( (text[3] = malloc(sizeof(char) * (strlen("He went to bed and bound his head with vinegar and brown paper.") + 1))) == NULL) return -1;
  strcpy(text[3], "He went to bed and bound his head with vinegar and brown paper.");

  // output
  unsigned int cnt;
  for(cnt = 0; cnt < TEXT_SIZE; ++cnt) printf("%i. line: \n#%s#\n", cnt, text[cnt]);

  // set a pattern
  char* pattern;
  if( (pattern = malloc(sizeof(PATTERN))) == NULL) return -1;
  strcpy(pattern, PATTERN);
  printf( "\nset a pattern: #%s#\n\n", pattern);

  // test which line contains the pattern
  printf( "test which line contains the pattern:\n\n");
  for(cnt = 0; cnt < TEXT_SIZE; ++cnt){
    printf("%d.line - CHECKING: \n\"%s\"\n", cnt, text[cnt]); // DEBUG
    if(check_line_for_pattern(text[cnt], strlen(text[cnt]), pattern, strlen(pattern))){
      printf( "\tContains the pattern: \"%s\"\n\n", pattern);
    }else{
      printf("\tNothing!\n\n"); // DEBUG
    }
  }

  printf("READY.\n");
  return 0;
};

/*
  reads a line and checks if it contains a certain pattern / word
  returns 1 in case of occurance, 0 in case of no occurance or error

  Works, but only reads in
//*/
int check_line_for_pattern( char* line, const unsigned long int LINE_SIZE, char* pattern, const unsigned long int PATTERN_SIZE)
{
  if(pattern == NULL) return 0;
  if(line == NULL) return 0;

  char* word = NULL;
  char token = ' ';
  char** ppWordListTemp = NULL;
  char temp[LINE_SIZE+1];
  char** ppWordList = NULL; // allocate space for each word XXX
  unsigned int wordListSize = 0;

  // processing each line: strtok into pieces by ' ', append to a list
  strcpy(temp, line);
  strcat(temp, "\n"); // TODO: check appending an '\n'

  // init strtok() with pointer to the '\0'-ed temp
  if( (word = strtok(temp, &token)) == NULL) return 0;

  // allocate space for one new pointer in the list, allocate space by the size of
  // word for this pointer, append to the list, and increment counter
  if( (ppWordList = (char**) malloc(sizeof(char*))) == NULL) return 0;
  ++wordListSize;
  if( (ppWordList[wordListSize - 1] = malloc( sizeof(char) * (strlen(word) + 1))) == NULL) return 0;
  strcpy( ppWordList[wordListSize - 1], word);
  // GDB - check ppWordList, wordListSize, word

  do{
    // reset the temp list pointer
    ppWordListTemp = NULL;

    // use strtok() with NULL, because already inited
    if( (word = strtok(NULL, &token)) == NULL) break;

    // reallocate space for one new pointer in the list
    // therefore allocate a temp pointer, assign the addresses of the list to this and
    // assign the ppWordListTemp's address to the ppWordList pointer
    if( (ppWordListTemp = malloc((wordListSize+1) * sizeof(char*))) == NULL) return 0;
    unsigned int cnt;
    for(cnt = 0; cnt < wordListSize; ++cnt) ppWordListTemp[cnt] = ppWordList[cnt]; // ppWordListTemp points to the same address!!

    // allocate ppWordListTemp[wordListSize] for one new element (index: wordListSize, still not incremented!)
    // the last element will be put into new allocated space
    if( (ppWordListTemp[wordListSize] = malloc( strlen(word) * (sizeof(char) + 1))) == NULL);
    strcpy(ppWordListTemp[wordListSize], word);
    ppWordList = ppWordListTemp;
    ++wordListSize;
  }while(word);

  // and search the list for the pattern
  unsigned int cnt;
  for(cnt = 0; cnt < wordListSize; ++cnt){
    printf("DEBUG: %i. line\tpattern:\"%s\" - \"%s\"\n", cnt, pattern, ppWordList[cnt]); // DEBUG

    // BETTER: use improved implementation of strcmp()!!!
    if(0 == strcmp(pattern, ppWordList[cnt])){
      for(cnt = 0; cnt < wordListSize; ++cnt) free(ppWordList[cnt]);
      free(ppWordList);
      return 1;
    }
  }

  // free allocated space
  for(cnt = 0; cnt < wordListSize; ++cnt) free(ppWordList[cnt]);
  free(ppWordList);
  free(ppWordListTemp);
  return 0;
};
-
Hello Fabeltier,
I tried your program on my machine and got rid of a few warnings. The improved program attached below compiles for me with:
gcc -W -Wall -pedantic -std=c99 -o a a.c

/*
  Separate a string by tokens and search for a pattern contained in that string.

  user@machine:/mnt/usbstick/programming/TODO_Strtok$ make
  gcc -c -O -g -Wall -std=c99 Strtok.c
  Strtok.c:22: error: expected identifier or ‘(’ before ‘__extension__’
  Strtok.c:22: error: expected identifier or ‘(’ before ‘)’ token
  make: *** [Strtok.o] Error 1
//*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TEXT_SIZE 4
#define PATTERN "Jill"

extern char* strcpy(char*, const char*);
extern size_t strlen(const char*);
extern char* strtok(char*, const char*);
extern int strcmp(const char*, const char*);
extern char* strcat(char*, const char*);

int check_line_for_pattern( char* line, const unsigned long int LINE_SIZE, char* pattern);

int main(void)
{
  /* init a char** with some text */
  printf("Strtok - separate a string into tokens\n\ninit a char** with some text:\n");
  char** text;
  if( (text = malloc(sizeof(char*)*4)) == NULL) return -1;

  if( (text[0] = malloc(sizeof(char) * (strlen("Jack and Jill went up the hill to fetch a pail of water") + 1))) == NULL) return -1;
  strcpy(text[0], "Jack and Jill went up the hill to fetch a pail of water");

  if( (text[1] = malloc(sizeof(char) * (strlen("Jack fell down and broke his crown and Jill came tumbling after.") + 1))) == NULL) return -1;
  strcpy(text[1], "Jack fell down and broke his crown and Jill came tumbling after.");

  if( (text[2] = malloc(sizeof(char) * (strlen("Up got Jack, and home did trot as fast as he could caper") + 1))) == NULL) return -1;
  strcpy(text[2], "Up got Jack, and home did trot as fast as he could caper");

  if( (text[3] = malloc(sizeof(char) * (strlen("He went to bed and bound his head with vinegar and brown paper.") + 1))) == NULL) return -1;
  strcpy(text[3], "He went to bed and bound his head with vinegar and brown paper.");

  /* output */
  unsigned int cnt;
  for(cnt = 0; cnt < TEXT_SIZE; ++cnt) printf("%i. line: \n#%s#\n", cnt, text[cnt]);

  /* set a pattern */
  char* pattern;
  if( (pattern = malloc(sizeof(PATTERN))) == NULL) return -1;
  strcpy(pattern, PATTERN);
  printf( "\nset a pattern: #%s#\n\n", pattern);

  /* test which line contains the pattern */
  printf( "test which line contains the pattern:\n\n");
  for(cnt = 0; cnt < TEXT_SIZE; ++cnt){
    printf("%d.line - CHECKING: \n\"%s\"\n", cnt, text[cnt]); /* DEBUG */
    if(check_line_for_pattern(text[cnt], strlen(text[cnt]), pattern)){
      printf( "\tContains the pattern: \"%s\"\n\n", pattern);
    }else{
      printf("\tNothing!\n\n"); /* DEBUG */
    }
  }

  printf("READY.\n");
  return 0;
}

/*
  reads a line and checks if it contains a certain pattern / word
  returns 1 in case of occurance, 0 in case of no occurance or error

  Works, but only reads in
*/
int check_line_for_pattern( char* line, const unsigned long int LINE_SIZE, char* pattern)
{
  if(pattern == NULL) return 0;
  if(line == NULL) return 0;

  char* word = NULL;
  char token = ' ';
  char** ppWordListTemp = NULL;
  char temp[LINE_SIZE+1];
  char** ppWordList = NULL; /* allocate space for each word XXX */
  unsigned int wordListSize = 0;

  /* processing each line: strtok into pieces by ' ', append to a list */
  strcpy(temp, line);
  strcat(temp, "\n"); /* TODO: check appending an '\n' */

  /* init strtok() with pointer to the '\0'-ed temp */
  if( (word = strtok(temp, &token)) == NULL) return 0;

  /* allocate space for one new pointer in the list, allocate space by the size of
     word for this pointer, append to the list, and increment counter */
  if( (ppWordList = (char**) malloc(sizeof(char*))) == NULL) return 0;
  ++wordListSize;
  if( (ppWordList[wordListSize - 1] = malloc( sizeof(char) * (strlen(word) + 1))) == NULL) return 0;
  strcpy( ppWordList[wordListSize - 1], word);
  /* GDB - check ppWordList, wordListSize, word */

  do{
    /* reset the temp list pointer */
    ppWordListTemp = NULL;

    /* use strtok() with NULL, because already inited */
    if( (word = strtok(NULL, &token)) == NULL) break;

    /* reallocate space for one new pointer in the list
       therefore allocate a temp pointer, assign the addresses of the list to this and
       assign the ppWordListTemp's address to the ppWordList pointer */
    if( (ppWordListTemp = malloc((wordListSize+1) * sizeof(char*))) == NULL) return 0;
    unsigned int cnt;
    for(cnt = 0; cnt < wordListSize; ++cnt) ppWordListTemp[cnt] = ppWordList[cnt];

    /* ppWordListTemp points to the same address!!
       allocate ppWordListTemp[wordListSize] for one new element (index: wordListSize, still not incremented!)
       the last element will be put into new allocated space */
    if( (ppWordListTemp[wordListSize] = malloc( strlen(word) * (sizeof(char) + 1))) == NULL) exit(0);
    strcpy(ppWordListTemp[wordListSize], word);
    ppWordList = ppWordListTemp;
    ++wordListSize;
  }while(word);

  /* and search the list for the pattern */
  unsigned int cnt;
  for(cnt = 0; cnt < wordListSize; ++cnt){
    printf("DEBUG: %i. line\tpattern:\"%s\" - \"%s\"\n", cnt, pattern, ppWordList[cnt]); /* DEBUG */

    /* BETTER: use improved implementation of strcmp()!!! */
    if(0 == strcmp(pattern, ppWordList[cnt])){
      for(cnt = 0; cnt < wordListSize; ++cnt) free(ppWordList[cnt]);
      free(ppWordList);
      return 1;
    }
  }

  /* free allocated space */
  for(cnt = 0; cnt < wordListSize; ++cnt) free(ppWordList[cnt]);
  free(ppWordList);
  free(ppWordListTemp);
  return 0;
}

When I run the program on my machine, I get the following output:

Strtok - separate a string into tokens

init a char** with some text:
0. line:
#Jack and Jill went up the hill to fetch a pail of water#
1. line:
#Jack fell down and broke his crown and Jill came tumbling after.#
2. line:
#Up got Jack, and home did trot as fast as he could caper#
3. line:
#He went to bed and bound his head with vinegar and brown paper.#

set a pattern: #Jill#

test which line contains the pattern:

0.line - CHECKING:
"Jack and Jill went up the hill to fetch a pail of water"
DEBUG: 0. line pattern:"Jill" - "Jack"
DEBUG: 1. line pattern:"Jill" - "and"
DEBUG: 2. line pattern:"Jill" - "Jill"
        Contains the pattern: "Jill"

1.line - CHECKING:
"Jack fell down and broke his crown and Jill came tumbling after."
DEBUG: 0. line pattern:"Jill" - "Jack"
DEBUG: 1. line pattern:"Jill" - "fell"
DEBUG: 2. line pattern:"Jill" - "down"
DEBUG: 3. line pattern:"Jill" - "and"
DEBUG: 4. line pattern:"Jill" - "broke"
DEBUG: 5. line pattern:"Jill" - "his"
DEBUG: 6. line pattern:"Jill" - "crown"
DEBUG: 7. line pattern:"Jill" - "and"
DEBUG: 8. line pattern:"Jill" - "Jill"
        Contains the pattern: "Jill"

2.line - CHECKING:
"Up got Jack, and home did trot as fast as he could caper"
DEBUG: 0. line pattern:"Jill" - "Up"
DEBUG: 1. line pattern:"Jill" - "got"
DEBUG: 2. line pattern:"Jill" - "Jack,"
DEBUG: 3. line pattern:"Jill" - "and"
DEBUG: 4. line pattern:"Jill" - "home"
DEBUG: 5. line pattern:"Jill" - "did"
DEBUG: 6. line pattern:"Jill" - "trot"
DEBUG: 7. line pattern:"Jill" - "as"
DEBUG: 8. line pattern:"Jill" - "fast"
DEBUG: 9. line pattern:"Jill" - "as"
DEBUG: 10. line pattern:"Jill" - "he"
DEBUG: 11. line pattern:"Jill" - "could"
DEBUG: 12. line pattern:"Jill" - "caper
"
        Nothing!

3.line - CHECKING:
"He went to bed and bound his head with vinegar and brown paper."
DEBUG: 0. line pattern:"Jill" - "He"
DEBUG: 1. line pattern:"Jill" - "went"
DEBUG: 2. line pattern:"Jill" - "to"
DEBUG: 3. line pattern:"Jill" - "bed"
DEBUG: 4. line pattern:"Jill" - "and"
DEBUG: 5. line pattern:"Jill" - "bound"
DEBUG: 6. line pattern:"Jill" - "his"
DEBUG: 7. line pattern:"Jill" - "head"
DEBUG: 8. line pattern:"Jill" - "with"
DEBUG: 9. line pattern:"Jill" - "vinega"
DEBUG: 10. line pattern:"Jill" - "and"
DEBUG: 11. line pattern:"Jill" - "brown"
DEBUG: 12. line pattern:"Jill" - "paper.
"
        Nothing!

READY.

This actually looks quite good.
Result: it works fine for me.
Regards, mcr
PS: maybe compare for yourself what I changed in your code.
It really wasn't much.
-
Hello,
I feel a bit stupid now, but I can't find the difference! I can see that you changed the comments from '//' to '/*' comments and removed the ';' at the end of the functions (a matter of style?). But my diff on Cygwin shows me lines that are the same?! OK, the length of the memory allocated for the pattern is no longer passed along, but that can hardly be it, can it? Besides, the diff now lists seemingly "identical" lines as differences; maybe my file contains some invisible characters there that cause trouble (character encoding). I definitely can't find any difference in the implementation, so it would be nice if you could explain that. Also, which warnings did you get from the compiler (unused parameter?)?
-
Here are my compiler warnings for your code:
b.c:29: warning: unused parameter ‘argc’
b.c:29: warning: unused parameter ‘argv’
b.c:68: warning: ISO C does not allow extra ‘;’ outside of a function
b.c: In function ‘check_line_for_pattern’:
b.c:122: warning: empty body in an if-statement
b.c: At top level:
b.c:78: warning: unused parameter ‘PATTERN_SIZE’
b.c:146: warning: ISO C does not allow extra ‘;’ outside of a function
I compiled with:
gcc -W -Wall -std=c99 -pedantic b.c
Compiler version: gcc (GCC) 4.1.2
Right, I only fixed the warnings listed above.
But even with the warnings, the result is correct. I don't know either why it wouldn't work for you.
Regards, mcr
-
Yes, really strange: the output of the old code is now correct for me as well. But I have also updated the Cygwin installation in the meantime, because I had further general problems with the system (diff, for example, did not work at all on Cygwin, Emacs acted up, etc.). It was probably down to the Cygwin installation. Thanks for the help anyway!