awk statement to validate and return fixed-structure data lines at start of file





























I have a file that starts like this:



## CONFIG-PARAMS-START ##
##
## text1 text2 NNNNNNNNN (arbitrary_comment) ##
## text1 text2 NNNNNNNNN (arbitrary_comment) ##
## text1 text2 NNNNNNNNN (arbitrary_comment) ##
##
## CONFIG-PARAMS-END ##
<arbitrary rest of file>


Output:

I'd like to validate the file with awk, checking that it starts this way.



If it does, output just the data lines (not the START/END marker lines, the "bare" ## lines, or anything after this section). If it doesn't, return a nonzero exit status ($?) or some other easily testable condition, such as empty output.



File spec:

In modern (PCRE) regex terms, the data line format is:



^##[[:space:]]*          - starts with ## and optional spaces
(([a-zA-Z0-9_-]+\.)+)    - >=1 repetition of [text_string][dot] (no spaces)
[[:space:]]+             - spaces
([^[:space:]]+)          - block of non-spaces
[[:space:]]+             - spaces
([0-9]+)                 - block of digits
[[:space:]]+             - spaces
\(.*                     - '(' + any text
##[[:space:]]*$          - 2 hashes, optional spaces + line end


(So a typical line might be: ## abc.3ef. w;4o8c-uy3tu!ae 9938 (good luck!)##  )



No other lines (including empty or whitespace-only lines) may appear before the first line or anywhere else in the data block. Within each line, consecutive whitespace effectively acts as a single delimiter. Whitespace after the first ##, and before or after the last ##, is optional. The section will typically have fewer than 15 lines, so size, speed and efficiency are negligible considerations.



(The greedy match on the second-to-last line of the pattern isn't an issue: it will backtrack just far enough to match the '##' on the final pattern line.)
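For illustration, one way the spec above might be turned into a portable check is sketched below as a small state machine in awk, wrapped in a shell function. This is only a sketch under stated assumptions, not a tested patch: the function name `extract_params` is hypothetical, whitespace is taken to mean spaces and tabs only (written as `[ \t]` instead of `[[:space:]]`, in case some target awk lacks POSIX character classes), the two bare `##` lines are treated as mandatory, and at least one data line is required. Data lines are buffered and printed only once the END marker has been seen, so a file that fails validation produces no output.

```shell
#!/bin/sh
# extract_params FILE  (hypothetical helper name, not from the original post)
# Prints the data lines and returns 0 if FILE starts with a valid block;
# prints nothing and returns 1 otherwise.
extract_params() {
    awk '
    BEGIN {
        # Assumption: whitespace means space or tab only.
        ws = "[ \t]"
        re_bare  = "^##" ws "*$"
        re_start = "^##" ws "*CONFIG-PARAMS-START" ws "*##" ws "*$"
        re_end   = "^##" ws "*CONFIG-PARAMS-END" ws "*##" ws "*$"
        re_data  = "^##" ws "*([a-zA-Z0-9_-]+[.])+" ws "+[^ \t]+" ws "+[0-9]+" ws "+[(].*##" ws "*$"
        # state 0: want START, 1: want bare ##, 2: in data, 3: want END
        state = 0; n = 0; ok = 0
    }
    state == 0 {
        if ($0 !~ re_start) exit
        state = 1; next
    }
    state == 1 {
        if ($0 !~ re_bare) exit
        state = 2; next
    }
    state == 2 {
        if ($0 ~ re_data) { line[++n] = $0; next }
        # Assumption: at least one data line must precede the closing bare ##.
        if (n > 0 && $0 ~ re_bare) { state = 3; next }
        exit
    }
    state == 3 {
        if ($0 ~ re_end) ok = 1
        exit
    }
    END {
        # Reached on every exit path; only a fully validated block is printed.
        if (!ok) exit 1
        for (i = 1; i <= n; i++) print line[i]
    }
    ' "$1"
}
```

A caller could then test the result with something like `if params=$(extract_params "$file"); then ...; fi`, which covers both the nonzero exit status and the empty-output condition.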



Compatibility:
Wide compatibility is important, as the code will eventually need to run on default/standard builds of different Linux distributions, FreeBSD and other BSDs, and maybe other modern *nix platforms. (It's part of a patch for a widely used open-source package.) Perhaps sticking to basic POSIX would provide a level playing field, rather than assuming one specific awk variant? Maintainability and ease of understanding are also useful for the same reason. Hoping greatly to avoid perl ;-)



I haven't quite got the hang of using awk for this sort of forward-and-backward referencing and checking, and I have even less idea how to manage compatibility and the slight differences between implementations.



Awk skills would be appreciated to get a working version of this snippet!










Tags: compatibility, awk

asked 17 mins ago by Stilez (edited 11 mins ago)





















