How can I find the link URL by link text with XPath?

I have a well-formed XHTML page. I want to find the destination URL of a link when I have the text that is linked.

Example

<a href="http://stackoverflow.com">programming questions site</a>
<a href="http://cnn.com">news</a>

I want an XPath expression such that, given "programming questions site", it returns http://stackoverflow.com, and given "news", it returns http://cnn.com.


Tim Cooper

Should be something similar to:

//a[text()='text_i_want_to_find']/@href
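For anyone who wants to try this outside an XPath console, here is a minimal Python sketch using only the standard library. It rests on a few assumptions: the sample page and the helper name href_for_link_text are made up for illustration, the page declares no XHTML namespace (with one, the plain tag name a would not match), and Python 3.7 or later is used. ElementTree supports only a subset of XPath, so the sketch uses the [.='...'] predicate and reads the href attribute in Python instead of evaluating /@href.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page. No XHTML namespace is declared,
# so the plain tag name "a" matches the anchors.
PAGE = """<html><body>
<a href="http://stackoverflow.com">programming questions site</a>
<a href="http://cnn.com">news</a>
</body></html>"""


def href_for_link_text(xml_source, link_text):
    """Return the href of the first <a> whose text equals link_text, or None."""
    if "'" in link_text:
        # The predicate below is built with single quotes, so this sketch
        # simply refuses text that would break the quoting.
        raise ValueError("link_text must not contain a single quote")
    root = ET.fromstring(xml_source)
    # ElementTree's limited XPath: [.='text'] compares the element's
    # complete text content (Python 3.7+).
    anchor = root.find(f".//a[.='{link_text}']")
    return None if anchor is None else anchor.get("href")


print(href_for_link_text(PAGE, "programming questions site"))
print(href_for_link_text(PAGE, "news"))
```

A full XPath 1.0 engine can evaluate the expression above directly, including the /@href step.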

Will I ever learn XPath? When I see a query it is so obvious and easy to understand, but I am never able to write one on my own.
@flybywire If you read this: Stanford's free Introduction to Databases course has a good section on XML and XPath.
Instead of text(), you can use ".=", for example //a[.='Register here']
What if I don't know the text? Can I select the nodes which contain http or a certain keyword?
David Moles

Too late for you, but for anyone else with the same question...

//a[contains(text(), 'programming')]/@href

Of course, 'programming' can be any text fragment.
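The same partial match can be approximated in Python with the standard library. This is only a sketch under stated assumptions: ElementTree has no contains() function, so the substring test is done in Python rather than in XPath; the sample page and the helper name hrefs_containing are hypothetical; and anchor.text covers only the text that appears before any child element of the link.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page without an XHTML namespace.
PAGE = """<html><body>
<a href="http://stackoverflow.com">programming questions site</a>
<a href="http://cnn.com">news</a>
</body></html>"""


def hrefs_containing(xml_source, fragment):
    """Return the hrefs of all <a> elements whose leading text contains fragment."""
    root = ET.fromstring(xml_source)
    return [
        anchor.get("href")
        for anchor in root.iter("a")
        # Substring test done in Python, standing in for XPath's contains().
        if fragment in (anchor.text or "") and anchor.get("href") is not None
    ]


print(hrefs_containing(PAGE, "programming"))
```

Like the XPath version, this comparison is case sensitive.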


This one is more generalized. Good share
This is case sensitive. Can I ignore the case here?
David Moles
//a[text()='programming questions site']/@href

which basically identifies an anchor node <a> that has the text you want, and extracts the href attribute.


Peter Mortensen

Think of the phrase in the square brackets as a WHERE clause in SQL.

So this query says: select the href attribute (@) of an a tag that appears anywhere (//), but only where (the bracketed phrase) the textual content of the a tag is equal to 'programming questions site'.


Hi Peter, can you recommend a tutorial site for learning XPath queries?
David Moles

For case insensitive contains, use the following:

//a[contains(translate(text(),'PROGRAMMING','programming'), 'programming')]/@href

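translate replaces each character listed in its second argument with the character at the same position in its third argument, so here the capital letters that occur in PROGRAMMING are converted to lower case before the comparison. Characters not listed in the second argument are left unchanged.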
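To see what the translate call above does to a mixed-case link text, here is a small Python emulation of XPath 1.0 translate(). The function name xpath_translate and the sample strings are made up for illustration; it is a sketch of the character mapping, not an XPath engine.

```python
import string


def xpath_translate(value, from_chars, to_chars):
    """Emulate XPath 1.0 translate(value, from_chars, to_chars)."""
    mapping = {}
    for index, ch in enumerate(from_chars):
        if ch in mapping:
            continue  # the first occurrence of a character wins
        # Characters with no counterpart in to_chars are removed (None).
        mapping[ch] = to_chars[index] if index < len(to_chars) else None
    result = []
    for ch in value:
        if ch not in mapping:
            result.append(ch)           # unlisted characters pass through
        elif mapping[ch] is not None:
            result.append(mapping[ch])  # listed characters are replaced
    return "".join(result)


text = "Programming Questions Site"

# Only the letters that occur in PROGRAMMING are lowered.
partial = xpath_translate(text, "PROGRAMMING", "programming")
print(partial)                   # programming Questions Site
print("programming" in partial)  # True

# Passing the whole alphabet lowers every ASCII letter.
full = xpath_translate(text, string.ascii_uppercase, string.ascii_lowercase)
print(full)                      # programming questions site
```

Listing only the letters of the search word is enough for this particular match. To search for arbitrary words, the full upper-case and lower-case alphabets can be passed as the second and third arguments instead.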


Please don't add "thanks" as answers. Invest some time in the site and you will gain sufficient privileges to upvote answers you like, which is the Stack Overflow way of saying thank you.
"Thanks" wasn't my "answer". I was, in a way, giving credit to an answer above that I improved on.
Adi Lester

If you are using HTML Agility Pack, use GetAttributeValue:

$doc2.DocumentNode.SelectNodes("//div[@class='className']/div[@class='InternalClass']/a[@class='InternalClass']").GetAttributeValue("href","")