How to use grep and cut to get the IP and a specific substring from a string?

From a web server log file I need to extract pairs of the form IP: __utma. The lines are in the standard format, except that the saved cookies are appended after a "|".

80.247.101.110 - - [16/Aug/2013:06:58:23 +0400] "POST /edit HTTP/1.1" 404 327 "http://site.ru/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.172 YaBrowser/1.7.1364.17262 Safari/537.22" "-" | "__utma=230214667.2058679839.1371519930.1376615440.1376617932.52; __utmc=230214667; __utmz=230214667.1371519930.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)"
October 3rd 19 at 04:14
4 answers
October 3rd 19 at 04:16
Solution
Well, since we're at it, then:
cat test.log | awk '{print $1,":",substr($0, index($0,"__utma"),64)}' | sed 's/ : __utma=/: /' | sed 's/;//'

I hope I understood you correctly?
Output:
80.247.101.110: 230214667.2058679839.1371519930.1376615440.1376617932.52
Well, it would actually help to have an example of the expected result :) - Laurence.Crist commented on October 3rd 19 at 04:19
But what if __utma is not a fixed size? - Lorena26 commented on October 3rd 19 at 04:22
Right. Then like this, I was too hasty:
cut -d " " -f1,25 test.log | awk '{print $1,":",$2}' | sed 's/ : "__utma=/: /' | sed 's/;//'


80.247.101.110: 230214667.2058679839.1371519930.1376615440.1376617932.52
- toni26 commented on October 3rd 19 at 04:25
No, that turns out to be unreliable. If the browser's user agent string is one word shorter, the cookie no longer lands in field 25. - Laurence.Crist commented on October 3rd 19 at 04:28
If the length of __utma changes, your version isn't an option either. What to do? - toni26 commented on October 3rd 19 at 04:31
How about this? It is harder to check with only one line in the file.
cat ./data.txt | awk '{print $1,":",substr($0, index($0,"__utma"),index($0,";"))}' | awk '{print $1, $3}' | sed 's/ __utma=/: /g'| sed 's/;//'

Sorry, I didn't copy the whole thing - Laurence.Crist commented on October 3rd 19 at 04:34
Your version didn't work for me, but like this it does:
cat test.log | awk '{print $1,":",substr($0, index($0,"__utma"),index($0,";"))}' | awk '{print $1, $3}' | sed 's/ __utma=/: /' | sed 's/;//'

80.247.101.110: 230214667.2058679839.1371519930.1376615440.1376617932.52
- toni26 commented on October 3rd 19 at 04:37
Well, that makes sense :) You have a file test.log, and I have a file data.txt - Laurence.Crist commented on October 3rd 19 at 04:40
That part is fine; you just edited your comment. This is the version of yours that I checked:
cat test.log | awk '{print $1,":",substr($0, index($0,"__utma"),index($0,";"))}' | awk '{print $1, $3}' | sed 's/" __utma"/": "/'

You completed it afterwards. - toni26 commented on October 3rd 19 at 04:43
Well, I copy-pasted the wrong line at first. :)
The main thing is that it worked. Good luck with the logs. - Laurence.Crist commented on October 3rd 19 at 04:46
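For the variable-length case raised in the comments, a minimal sketch using awk's match() could look like this. It does not depend on the field number or on a fixed length of 64. The function name utma_pairs is made up for illustration, and the sketch assumes the __utma value consists only of digits and dots, as in the sample line.

```shell
# Hypothetical helper: print "IP: value" for every line that carries an __utma cookie.
# Reads the files given as arguments, or standard input if there are none.
utma_pairs() {
  awk 'match($0, /__utma=[0-9.]+/) {
         # RSTART/RLENGTH describe the match; skip the 7 characters of "__utma="
         print $1 ": " substr($0, RSTART + 7, RLENGTH - 7)
       }' "$@"
}

# The sample line from the question, fed through standard input
utma_pairs <<'EOF'
80.247.101.110 - - [16/Aug/2013:06:58:23 +0400] "POST /edit HTTP/1.1" 404 327 "http://site.ru/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.172 YaBrowser/1.7.1364.17262 Safari/537.22" "-" | "__utma=230214667.2058679839.1371519930.1376615440.1376617932.52; __utmc=230214667; __utmz=230214667.1371519930.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)"
EOF
# prints: 80.247.101.110: 230214667.2058679839.1371519930.1376615440.1376617932.52
```

With a real file this would be called as utma_pairs test.log. Lines without an __utma cookie are skipped silently.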
October 3rd 19 at 04:18
If awk is allowed, then like this:
cat ./data.txt | awk '{print $1,":",substr($0, index($0,"__utma"),64)}'
That is, if the __utma= value always has this length - Laurence.Crist commented on October 3rd 19 at 04:21
My output is this:
80.247.101.110 : __utma=230214667.2058679839.1371519930.1376615440.1376617932.52;
But shouldn't it be:
80.247.101.110: 230214667.2058679839.1371519930.1376615440.1376617932.52? - Lorena26 commented on October 3rd 19 at 04:24
I thought the space was supposed to be there, sorry. Then it gets a little longer:
cat ./data.txt | awk '{print $1,":",substr($0, index($0,"__utma"),64)}' | sed 's/ :/:/'
- toni26 commented on October 3rd 19 at 04:27
If the __utma= prefix is not needed, it can be stripped as well:
cat ./data.txt | awk '{print $1,":",substr($0, index($0,"__utma"),64)}' | sed 's/ :/:/' | sed 's/__utma=//g'

But I would leave the spaces in if you intend to parse the output later - Laurence.Crist commented on October 3rd 19 at 04:30
October 3rd 19 at 04:20
Something like this is possible:

# Read the log line by line; "for line in $(cat log)" would split on every space
while IFS= read -r line
do
 ip=$(echo "$line" | grep -o "^[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+")
 utma=$(echo "$line" | grep -o "__utma=\([0-9]\+\.\)\+[0-9]\+" | cut -d '=' -f2)
 echo "$ip $utma"
done < log
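The same extraction can probably be done in a single pass without a shell loop. The sketch below uses one sed substitution with two capture groups. The function name ip_utma_sed is hypothetical, and it assumes that each line starts with the IP and that the __utma value contains only digits and dots.

```shell
# Hypothetical helper: one sed pass instead of a per-line loop.
# \1 captures the leading IP, \2 the digits-and-dots value after "__utma=".
# -n with the p flag prints only the lines where the substitution succeeded.
ip_utma_sed() {
  sed -n 's/^\([0-9.]\{1,\}\) .*__utma=\([0-9.]\{1,\}\).*/\1: \2/p' "$@"
}

# The sample line from the question, fed through standard input
ip_utma_sed <<'EOF'
80.247.101.110 - - [16/Aug/2013:06:58:23 +0400] "POST /edit HTTP/1.1" 404 327 "http://site.ru/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.172 YaBrowser/1.7.1364.17262 Safari/537.22" "-" | "__utma=230214667.2058679839.1371519930.1376615440.1376617932.52; __utmc=230214667; __utmz=230214667.1371519930.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)"
EOF
# prints: 80.247.101.110: 230214667.2058679839.1371519930.1376615440.1376617932.52
```

On a large log this avoids starting grep and cut for every line, which is the main cost of the loop.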
October 3rd 19 at 04:22
cat log | awk -F"" '{print $1,$4}' | awk '{print $1":",$4}' | sed 's/"__utma=//g' | sed 's/;$//'
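A field-separator approach can also be sketched with a single awk call, using the literal string "__utma=" as the separator. This is only an illustration under assumptions: the function name utma_by_fs is made up, the IP is taken to be the first whitespace-separated word, and the value is taken to end at the first semicolon or double quote.

```shell
# Hypothetical helper: split each line on the literal "__utma=".
# $1 is everything before the cookie, $2 everything after it.
utma_by_fs() {
  awk -F'__utma=' 'NF > 1 {
         split($1, head, " ")        # head[1] is the IP at the start of the line
         split($2, tail, /[;"]/)     # tail[1] is the value up to ";" or a quote
         print head[1] ": " tail[1]
       }' "$@"
}

# The sample line from the question, fed through standard input
utma_by_fs <<'EOF'
80.247.101.110 - - [16/Aug/2013:06:58:23 +0400] "POST /edit HTTP/1.1" 404 327 "http://site.ru/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.172 YaBrowser/1.7.1364.17262 Safari/537.22" "-" | "__utma=230214667.2058679839.1371519930.1376615440.1376617932.52; __utmc=230214667; __utmz=230214667.1371519930.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)"
EOF
# prints: 80.247.101.110: 230214667.2058679839.1371519930.1376615440.1376617932.52
```

Because the split happens on the cookie name rather than on spaces or "|", the length of the user agent string and the "|" characters inside __utmz do not affect the result.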
