Scrap the web page using jsoup
I need to scrape the postcode from the HTML below using jsoup. I only need the postcode (W2 in this example), which is part of the href attribute of an a tag:
<a href="/properties-for-sale/w2/chpk3848653" class="property_photo_holder" style="backgroundimage:url(https://assets.foxtons.co.uk/w/480/1523289105/chpk3848653-23.jpg)"></a>
This is the surrounding HTML:
</div>
<div id="property_1062067" class="property_summary">
<h6><a href="/properties-for-sale/w2/chpk3848653">Lancaster Gate, <span class="property_address_location_name">Bayswater,</span> W2</a></h6>
Can anyone help?
Thank you.
java html parsing web-scraping jsoup
What do you mean by "I only need postcode which is W2" ? Also, may you post something you tried?
– Subhasish Bhattacharjee
Nov 11 at 9:18
I was just trying to show exactly which data I want to scrape. Please see below:
– Hakan
Nov 11 at 9:48
>Bayswater,</span> W2</a></h6>
– Hakan
Nov 11 at 9:48
This is the code I tried:
– Hakan
Nov 11 at 9:51
Elements postcodes = doc.select("span.property_address_location_name"); for (Element postcode : postcodes) { System.out.println(postcode.text()); }
– Hakan
Nov 11 at 9:51
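One likely reason the selector in the comment above prints only "Bayswater," is that W2 sits outside the span, as trailing text of the enclosing a element. Assuming the anchor's full visible text has already been read (for the sample markup that would be something like "Lancaster Gate, Bayswater, W2"), a minimal sketch of taking the last comma-separated part could look like this. The class and method names are made up for illustration, and the sketch uses only plain strings, not jsoup.

```java
final class TrailingPostcode {

    /** Returns the trimmed text after the last comma, or an empty string if there is no comma. */
    static String lastCommaPart(String anchorText) {
        int lastComma = anchorText.lastIndexOf(',');
        if (lastComma < 0) {
            return "";
        }
        return anchorText.substring(lastComma + 1).trim();
    }

    public static void main(String[] args) {
        // Sample text taken from the anchor in the question.
        System.out.println(lastCommaPart("Lancaster Gate, Bayswater, W2"));
    }
}
```

This relies on the postcode always being the last part of the visible text, which is an assumption about the page rather than something the markup guarantees.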
edited Nov 11 at 11:25
Dinko Pehar
asked Nov 11 at 8:46
Hakan
1 Answer
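You can use jsoup for that. You just need to retrieve the href attribute value, as follows (URL is the address of the page being scraped):
// Fetch and parse the listing page.
Document document = Jsoup.connect(URL).userAgent("Mozilla/5.0").get();
// Select only the property links rather than every anchor on the page.
Elements elements = document.select("a[href^=/properties-for-sale/]");
// attr() returns the value from the first matched element that has the attribute.
String href = elements.attr("href");
Now that you have the href attribute as a String, you can apply a regular expression to extract the field you want, in this case the postcode contained in "/properties-for-sale/w2/chpk3848653". Pattern and Matcher come from java.util.regex:
// Capture the path segment that follows "/properties-for-sale/".
String regex = "/properties-for-sale/([a-zA-Z0-9]+)/";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(href);
// find() returns a boolean, so check it before reading the captured group.
String postalCode = matcher.find() ? matcher.group(1).toUpperCase() : null;
That's all. If you need anything else, feel free to ask. Hope this helps!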
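As a self-contained illustration of the regular-expression step, here is a minimal sketch that uses only the Java standard library, so it can be run without jsoup. It assumes every property link follows the /properties-for-sale/<postcode>/<property-id> layout seen in the question; the class name PostcodeExtractor and its methods are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PostcodeExtractor {

    // Assumed path layout: /properties-for-sale/<postcode>/<property-id>
    private static final Pattern POSTCODE_IN_HREF =
            Pattern.compile("^/properties-for-sale/([A-Za-z0-9]+)/");

    /** Returns the upper-cased postcode segment, or empty if the href has another shape. */
    static Optional<String> extractPostcode(String href) {
        if (href == null) {
            return Optional.empty();
        }
        Matcher matcher = POSTCODE_IN_HREF.matcher(href);
        // find() only reports whether a match exists; the captured text comes from group(1).
        return matcher.find()
                ? Optional.of(matcher.group(1).toUpperCase(Locale.ROOT))
                : Optional.empty();
    }

    /** Extracts postcodes from several hrefs, skipping any that do not match. */
    static List<String> extractAll(List<String> hrefs) {
        List<String> postcodes = new ArrayList<>();
        for (String href : hrefs) {
            extractPostcode(href).ifPresent(postcodes::add);
        }
        return postcodes;
    }

    public static void main(String[] args) {
        // The href from the question.
        System.out.println(extractPostcode("/properties-for-sale/w2/chpk3848653").orElse("not found"));
    }
}
```

Returning an Optional instead of calling group() unconditionally avoids an IllegalStateException when a link does not match the expected layout.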
Something is wrong with this code. Thanks anyway.
– Hakan
Nov 13 at 19:07
@Hakan No problem! Ask me if you need anything else; that was just sample code to use as a guide. +1 if you found it useful!
– alvarobartt
Nov 14 at 8:35
This is the code I used to scrape all the other attributes:
– Hakan
Nov 15 at 10:52
//Get the location of property
Elements locations = items.get(i).getElementsByTag("h6");
//Get the post code of property
Elements postcodes = items.get(i).getElementsByTag("h6.a[href]");
//Get the longitude
Elements longitude = items.get(i).select("div");
– Hakan
Nov 15 at 10:52
foxtons.co.uk/… This is the link to the page I am scraping.
– Hakan
Nov 15 at 10:53
answered Nov 13 at 13:06
alvarobartt