Why Should I Care About Regular Expressions?
==> cat sample
...
print("ENTER YOUR NAME: ");
...
printf( "CAN'T OPEN %s\n", file );
...
printf ( "OPENING FILE: %s\n", filename );
...
printf ( "CAN'T OPEN FILE: %s\n", filename );
...
printf ("CAN'T OPEN FILE: %s\n","/usr/data");
...
fprint(stderr,"CAN'T OPEN %s\n",errfile);
...
value=two;
...
==> sed -n \
'1,$s/\( *\)printf.*"\(.*\)",\(.*\)/\1fprintf( stderr, "\2", \3/p' sample
fprintf( stderr, "CAN'T OPEN %s\n", file );
fprintf( stderr, "OPENING FILE: %s\n", filename );
fprintf( stderr, "CAN'T OPEN FILE: %s\n", filename );
fprintf( stderr, "CAN'T OPEN FILE: %s\n", "/usr/data");
==>
\1 = leading white space; \2 = text inside the double quotes; \3 = everything after the comma
Notes:
This example illustrates a common dilemma when sweeping code changes are in order. The scenario is as follows: change all printf() function calls to fprintf() function calls that write to stderr, while preserving the message to be printed.
The expression for the pattern match looks cryptic; however, once you have worked out your algorithm, it's not that difficult to transcribe the words into an expression. The algorithm for the expression is the following:
- Save the white space leading up to printf as positional parameter 1; denoted by \( *\)
- Match printf
- Match any characters after printf up to the opening double quote; denoted by .*"
- Match the string inside the double quotes and assign it to positional parameter 2; denoted by the first occurrence of \(.*\)
- Match the closing double quote and the comma that follows it
- Assign everything to the right of the comma to positional parameter 3; denoted by the second occurrence of \(.*\)
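The steps above can be checked by printing each saved piece on its own. The following is a minimal sketch, assuming a POSIX shell and sed. The variable names line and groups and the bracketed output format are made up for illustration, and the four leading spaces are added here so that \1 has something to hold.

```shell
# One statement from the sample, indented by four spaces for illustration.
# The quoted here-document keeps the quotes and the \n exactly as typed.
line=$(cat <<'EOF'
    printf ( "CAN'T OPEN FILE: %s\n", filename );
EOF
)

# Same pattern as in the example; the replacement shows each saved
# piece between square brackets instead of building an fprintf call.
groups=$(printf '%s\n' "$line" |
    sed -n 's/\( *\)printf.*"\(.*\)",\(.*\)/1=[\1] 2=[\2] 3=[\3]/p')

printf '%s\n' "$groups"
# prints: 1=[    ] 2=[CAN'T OPEN FILE: %s\n] 3=[ filename );]
```

Note that \3 begins with the space that followed the comma, so a replacement that writes its own space before \3 ends up with two.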
Unfortunately, the expression wasn't specific enough: it also rewrote the "OPENING FILE" message, when we should have been matching only the strings indicating that a file couldn't be opened. The following command line produces the correct results:
==> sed -n \
'1,$s/\( *\)printf.*"\(.*CAN.T OPEN.*\)",\(.*\)/\1fprintf( stderr, "\2", \3/p' \
sample
fprintf( stderr, "CAN'T OPEN %s\n", file );
fprintf( stderr, "CAN'T OPEN FILE: %s\n", filename );
fprintf( stderr, "CAN'T OPEN FILE: %s\n", "/usr/data");
There may still be room for improvement in refining the expression. Think of how one might handle multiple statements on a line delimited by semicolons.
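One possible refinement for the several-statements-per-line case is sketched below. It is not the only answer. It assumes that the message contains no escaped double quotes and that the remaining arguments contain no semicolons. The input line and the variable names line and result are made up for illustration. Here [^"]* keeps the message inside one pair of quotes, [^;]* stops at the end of the statement, and the g flag repeats the substitution along the line, so the positional parameters are renumbered: \1 is the message and \2 is the rest of the arguments.

```shell
# A made-up line holding several statements, only two of which should change.
line=$(cat <<'EOF'
i++; printf("CAN'T OPEN %s\n", a); printf("OPENING FILE: %s\n", b); printf ( "CAN'T OPEN FILE: %s\n", c );
EOF
)

# [^"]* cannot cross a double quote and [^;]* cannot cross a semicolon,
# so each match stays inside a single printf statement.
result=$(printf '%s\n' "$line" |
    sed 's/printf *( *"\([^"]*CAN.T OPEN[^"]*\)", *\([^;]*\);/fprintf( stderr, "\1", \2;/g')

printf '%s\n' "$result"
```

Only the two CAN'T OPEN statements are rewritten; the OPENING FILE statement and the i++ are left alone.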