SED or AWK script to delete a line if value in field 10 of filea is in fileb?

Published: Tue, 29 Apr 2008 19:35:00 GMT

Hello all. I have a big file that I need to delete a number of lines out of. I want to read the values that determine deletion from another file, because there are multiple values. I was trying to use grep -vf fileb filea, but it's not going to work because the match has to be based on a specific field in filea, and I was getting false hits (and thus deleting lines) when the same value coincidentally appeared in other fields.

Can someone post an example of a sed or awk script to do this?
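
The false hits are easy to reproduce. A small sketch with made-up data (the file contents and the key X9 are invented for illustration), in which only the first line has the key in field 10:

```shell
#!/bin/sh
# Made-up sample data: the key X9 sits in field 10 of line 1 only.
tmp=$(mktemp -d)
printf '%s\n' X9 > "$tmp/fileb"
printf '%s\n' \
    'a b c d e f g h i X9 k' \
    'X9 b c d e f g h i j k' \
    'a b c d e f g h i X99 k' \
    'a b c d e f g h i j k' > "$tmp/filea"

# grep -v -f drops a line if any pattern matches anywhere in it,
# so lines 2 and 3 are lost along with line 1.
kept=$(grep -v -f "$tmp/fileb" "$tmp/filea")
printf '%s\n' "$kept"
rm -rf "$tmp"
```

Only the last line survives here, although lines 2 and 3 should have been kept as well: X9 appears in field 1 of line 2 and as a substring of X99 in line 3.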

All Comments

  • 5 Comments
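    • awk '
      BEGIN {
        # load every line of fileb as a key of the array d
        while ((getline < "fileb") > 0)
          d[$0] = ""
      }
      # print only the lines of filea whose field 10 is not a key
      ! ($10 in d)' < filea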

      (use /usr/xpg4/bin/awk on Solaris)

      Stephane

      #1; Tue, 29 Apr 2008 19:37:00 GMT
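
The same lookup-table approach is often written with the two-file idiom, where awk reads the key file first and the data file second. A minimal sketch, assuming whitespace-separated fields and one key per line in fileb; the helper name filter_field10 and the sample data are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the lines of file $2 whose 10th field
# is NOT listed (one key per line) in file $1.
filter_field10() {
    awk '
        NR == FNR { d[$0]; next }   # first file (key list): remember each line as a key
        !($10 in d)                 # second file: print only if field 10 is not a key
    ' "$1" "$2"
}

# Made-up sample data: only line 1 has the key X9 in field 10.
tmp=$(mktemp -d)
printf '%s\n' X9 > "$tmp/fileb"
printf '%s\n' \
    'a b c d e f g h i X9 k' \
    'X9 b c d e f g h i j k' \
    'a b c d e f g h i X99 k' > "$tmp/filea"

result=$(filter_field10 "$tmp/fileb" "$tmp/filea")
printf '%s\n' "$result"
rm -rf "$tmp"
```

One caveat with this variant: if fileb is empty, NR == FNR stays true while filea is read, so nothing is printed. The getline loop in the comment above does not have that problem.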
    • So how about using:

      cut -d<separator> -f<fieldnumber> filea > /tmp/keystodelete

      possibly piping through 'sort -u' too, to generate a unique list of the values in the field of filea that determines deletion?

      Then you could use your grep command. If that won't do it, then maybe 'comm' and 'join' will help you.

      Alexis

      #2; Tue, 29 Apr 2008 19:38:00 GMT
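
One way to read the cut suggestion is to run grep against the extracted field only, and use the matching line numbers to delete from the original file. This is a sketch of that interpretation, not necessarily what the poster had in mind. It assumes fields separated by exactly one space and at least 10 fields on every line; the sample data and temporary file names are made up:

```shell
#!/bin/sh
# Made-up sample data: only line 1 has the key X9 in field 10.
tmp=$(mktemp -d)
printf '%s\n' X9 > "$tmp/fileb"
printf '%s\n' \
    'a b c d e f g h i X9 k' \
    'X9 b c d e f g h i j k' \
    'a b c d e f g h i X99 k' > "$tmp/filea"

# 1. Isolate field 10, one value per input line (-d sets the delimiter).
cut -d' ' -f10 "$tmp/filea" > "$tmp/field10"

# 2. Find the line numbers whose field 10 equals a key exactly
#    (-F fixed strings, -x whole line, -n line numbers) and turn
#    each number N into the sed command "Nd".
grep -n -F -x -f "$tmp/fileb" "$tmp/field10" | cut -d: -f1 | sed 's/$/d/' > "$tmp/todelete"

# 3. Delete exactly those lines from the original file.
result=$(sed -f "$tmp/todelete" "$tmp/filea")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Because grep only ever sees field 10, a key that turns up in another field, or as part of a longer value, no longer causes a deletion.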
    • sumGirl wrote:

      > Can someone post an example of a sed or awk script to do this?

      step-1:

      sed -e 's:.*:/&/d' fileb > todelete

      step-2:

      sed -f todelete filea

      #3; Tue, 29 Apr 2008 19:39:00 GMT
    • sumGirl wrote:

      > Can someone post an example of a sed or awk script to do this?

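      A ksh93 solution (untested):

      #!/usr/bin/ksh93
      DELIM="whatever your file delimiter character(s) are"
      X="field number from bigfile on which to base deletion test"

      # Set IFS for the whole script so that "${BIGFILE[*]}" is joined
      # with the delimiter when the record is printed.
      IFS="${DELIM}"
      while read -r -A -- BIGFILE
      do
        KEEP=1
        while IFS= read -r -- DELETEME
        do
          # Arrays are zero-based, so field X is element X-1.
          [[ "_${BIGFILE[X-1]}" == "_${DELETEME}" ]] && { KEEP=0; break; }
        done < fileb # file with deletion indicators
        # Print the record once, and only if no indicator matched.
        (( KEEP )) && print -r -- "${BIGFILE[*]}"
      done < filea # so called Big file

      # The output will be each field from the bigfile, separated
      # by the first character of the IFS variable.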

      Dana French

      #4; Tue, 29 Apr 2008 19:40:00 GMT
    • rakesh sharma wrote:

      >

      > step-1:

      > sed -e 's:.*:/&/d' fileb > todelete

      > step-2:

      > sed -f todelete filea

      >

      Oops! I misunderstood the specs. Here's the corrected solution. Sorry about that.

      step-1:

      sed -e '
      1i\
      s/[^ ][^ ]*/\\\
      &\\\
      /10
      s:.*:/\\n&\\n/d:
      $a\
      s/\\n//g
      ' fileb > todelete

      step-2:

      sed -f todelete filea > results

      #5; Tue, 29 Apr 2008 19:41:00 GMT
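
To see what step 1 of the corrected sed solution generates, the same todelete script can be assembled with printf instead of sed's i and a commands. A sketch with made-up sample data; it assumes space-separated fields and keys free of slashes and regex metacharacters, since each key ends up inside a sed address:

```shell
#!/bin/sh
# Made-up sample data: only line 1 has the key X9 in field 10.
tmp=$(mktemp -d)
printf '%s\n' X9 > "$tmp/fileb"
printf '%s\n' \
    'a b c d e f g h i X9 k' \
    'X9 b c d e f g h i j k' \
    'a b c d e f g h i X99 k' > "$tmp/filea"

{
    # Wrap field 10 (the 10th run of non-space characters) in newlines.
    printf '%s\n' 's/[^ ][^ ]*/\' '&\' '/10'
    # One delete command per key, of the form /\nKEY\n/d
    sed 's:.*:/\\n&\\n/d:' "$tmp/fileb"
    # Lines that survive get the marker newlines removed again.
    printf '%s\n' 's/\n//g'
} > "$tmp/todelete"

result=$(sed -f "$tmp/todelete" "$tmp/filea")
printf '%s\n' "$result"
rm -rf "$tmp"
```

The embedded newlines act as field markers that cannot occur in the input line itself, so a key only matches when it is the whole of field 10.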