Excel VBA HTML Nested QuerySelector












Consider this extract of an HTML page:



<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Document</title>
</head>
<body>
<div class="BoxBody">
<span class="txt">20 Records found. </span>
<p style="text-align: right;"><span class="txt">[First/Previous] &nbsp;1&nbsp;, <a class="page" href="javascript:paginacao('paginar','2');" title="Go to page 2">2</a> [<a class="page" title="Next page" href="javascript:paginacao('paginar','next');">Next</a>/<a class="page" title="Last page" href="javascript:paginacao('paginar','last');">Last</a>]</span></p>
<br>
<span class="txt">25 Records found. </span>
<p style="text-align: right;"><span class="txt">[First/Previous] &nbsp;1&nbsp;, <a class="page" href="javascript:paginacao('paginar2','2');" title="Go to page 2">2</a> [<a class="page" title="Next page" href="javascript:paginacao('paginar2','next');">Next</a>/<a class="page" title="Last page" href="javascript:paginacao('paginar2','last');">Last</a>]</span></p>
</div>
</body>
</html>


I am trying to get the anchor tag that has the "next" page href (if it has one).



I tried this in the console using Firefox and it works:



document.querySelector(".BoxBody > p:nth-child(2) > span:nth-child(1)").querySelector("a[title='Next page']")


I put together some sample VBA code that also uses querySelector, but it fails with an "Invalid argument" error.



Sub test()

    ' HTMLDocument requires a reference to "Microsoft HTML Object Library"
    Dim oFSO As Object, paginator As Object
    Dim oFS As Object, sText As String

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    ' ThisWorkbook.Path has no trailing separator, so add one before the file name
    Set oFS = oFSO.OpenTextFile(ThisWorkbook.Path & "\example.html")

    Do Until oFS.AtEndOfStream
        sText = oFS.ReadAll()
    Loop


    Dim html As HTMLDocument, html2 As Object
    Set html = New HTMLDocument
    Set html2 = html
    html2.Write sText

    ' This is the line that fails with "Invalid argument"
    Set paginator = html.querySelector(".BoxBody > p:nth-child(2) > span:nth-child(1)").querySelector("a[title='Next page']")

End Sub


What is causing this? Is it the p:nth-child(2) selector?
How should I go about extracting that element using VBA?










      html excel vba web-scraping css-selectors














      edited Nov 13 at 5:42









      BoltClock











      asked Nov 12 at 16:38









      drec4s

























          1 Answer














          :nth-child(2) is not supported by the querySelector implementation available from VBA here, and it is indeed what causes the error message. You can't use :nth-child() or :nth-of-type(); the libraries available to you implement very few pseudo-classes. Interestingly, :first-child does work. You will also find that you are limited in which objects you can chain querySelector on.



          Dim ele As Object, iText As String
          ' :first-child is supported, so it stands in for span:nth-child(1)
          Set ele = html.querySelector(".BoxBody > p > span:first-child > a[title='Next page']")

          ' querySelector returns Nothing when there is no match, so guard before reading href
          If Not ele Is Nothing Then iText = ele.href

          If iText = vbNullString Then ' no matching element, or the element has no href
              Debug.Print "No href"
          Else
              Debug.Print iText
          End If
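
Because the page contains two paginators with the same title attribute, one possible way to pick a specific one without :nth-child() is to collect every match and index into the result. The following is only a sketch under assumptions: it presumes an early-bound HTMLDocument (reference to "Microsoft HTML Object Library") populated as in the question, and PrintNextLinks is a hypothetical helper name, not something from the original post.

```vba
' Sketch only: assumes a reference to "Microsoft HTML Object Library".
' PrintNextLinks is a hypothetical helper name used for illustration.
Sub PrintNextLinks(ByVal html As HTMLDocument)
    Dim links As Object, i As Long

    ' querySelectorAll returns all matches in document order,
    ' so index 0 is the "Next" link of the first paginator.
    Set links = html.querySelectorAll(".BoxBody > p > span:first-child > a[title='Next page']")

    ' Walk the list by index rather than with For Each
    For i = 0 To links.Length - 1
        Debug.Print i, links.item(i).href
    Next i
End Sub
```

With the sample HTML from the question, links.item(0) would be the link inside the first p element (the one that p:nth-child(2) was meant to select) and links.item(1) the link in the second paginator. If Length is 0, no "Next page" link was found.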





          edited Nov 13 at 5:42









          BoltClock











          answered Nov 12 at 16:56









          QHarr













          • That was my first solution, but since there are two similar paginated tables in the page (with that same title attribute), I really need to check whether that element exists inside the .BoxBody > p:nth-child(2) > span:nth-child(1) element.
            – drec4s
            Nov 12 at 17:00

          • No, but I can edit my question to exemplify that.
            – drec4s
            Nov 12 at 17:05

          • OK, if there is enough to demonstrate the choice that must be made.
            – QHarr
            Nov 12 at 17:05

          • No, I want one match only (whether the 'next' button has an href or not).
            – drec4s
            Nov 12 at 17:14

          • Well, that first-child is a nice catch, thanks!
            – drec4s
            Nov 12 at 17:29

















