PHP can edit the DOM, so we can use it to apply a few SEO tricks to user-submitted HTML such as a blog post. If we find images, let's add title and alt attributes to them. We can do the same for links: add nofollow and target="_blank" to external links. Let's do that!
PHP DOM: A quick introduction with simple examples
Working with PHP DOM is similar to working with other DOM adapters in PHP: you must load the document into a variable before you start editing. If you are familiar with PHP SimpleXML, PHP DOM will be nothing new for you; only the commands differ.
In this post I will edit this markup:
<p>As you can see, this site uses Highlight.js too and you can check how will looks our result. To get lib code, you can visit <a rel="nofollow" href="https://github.com/SashaDesigN/highlight.js">GitHub repo</a>. Or you can download customized version of lib on it <a href="https://highlightjs.org/download/">website</a>.</p>
<img data-after="images/1444298151.png">
<p>The code above do the injection of Highlight.js and init of it class, which do code more beautiful. Let's looking more deeper: this code highlights lib has more than 30 different styles, which you can find in <code>highlight/styles</code> a folder.</p>
I will store this HTML in the $_POST['text'] variable, as if it had been sent from a form. Now we can load it into PHP:
$dom = new DOMDocument();
$html = $_POST['text'];
$dom->loadHTML($html);
We create a new instance of the DOMDocument object and load our markup. There is one painful thing here: loadHTML does not assume UTF-8 input, so non-Latin characters can come out garbled. To fix this, you can use mb_convert_encoding to convert the text before passing it to loadHTML.
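$dom = new DOMDocument();
# turn non-ASCII characters into HTML entities so loadHTML does not garble them
# (the 'HTML-ENTITIES' encoding is deprecated since PHP 8.2)
$html = mb_convert_encoding($_POST['text'], 'HTML-ENTITIES', "UTF-8");
# @ hides the warnings libxml raises for imperfect markup
@$dom->loadHTML($html);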
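On newer PHP versions the same loading step could look roughly like the sketch below. It assumes the mbstring extension is available, swaps in mb_encode_numericentity for the 'HTML-ENTITIES' conversion, and uses libxml_use_internal_errors in place of @. The helper name loadUserHtml is hypothetical, not part of PHP.

```php
<?php
# Hypothetical helper (the name is an assumption): loads a UTF-8 HTML fragment.
function loadUserHtml(string $html): DOMDocument
{
    # Encode every non-ASCII character as a numeric entity so that
    # loadHTML() cannot misread the bytes as another charset.
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    $encoded = mb_encode_numericentity($html, $convmap, 'UTF-8');

    $dom = new DOMDocument();
    # Collect libxml warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($encoded);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom;
}

$dom = loadUserHtml('<p>Привет, мир</p>');
echo $dom->getElementsByTagName('p')->item(0)->textContent, PHP_EOL;
```

The previous error-handling setting is restored at the end, so the helper does not change how the rest of the script reports libxml problems.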
Once the DOM is loaded, we can do anything with our object. As I said before, in this post we will make our content more SEO-friendly. To do this, let's start with external links.
PHP DOM: Add nofollow and _blank to links
The idea is pretty simple: get all links, loop over them, and for each external link add the nofollow and target attributes; after the loop, save the changes from the DOM.
# select all links
$links = $dom->getElementsByTagName('a');
if($links instanceof DOMNodeList){
    foreach($links as $a){
        # check if this is an external link (absolute http or https URL)
        if(preg_match('#^https?://#i', $a->getAttribute('href'))){
            $a->setAttribute('rel','nofollow');
            $a->setAttribute('target','_blank');
        }
    }
}
# save changes
$text = $dom->saveHTML($dom);
Strictly speaking, $links is not a copy of the selected elements; it is a DOMNodeList object that refers to the elements inside the document, much like a reference (&) in PHP, so changing an item changes the DOM itself. That is why the next line is the instanceof DOMNodeList test, which makes sure $links really is a node list before we loop. When nothing matches, the list is simply empty and the loop body never runs.
Inside the loop, we check whether the current link is external and, if it is, set the rel and target attributes.
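The loop above can also be wrapped in a function that leaves existing attributes alone and only adds what is missing. This is a minimal sketch: the name markExternalLinks is hypothetical, and treating every absolute http(s) URL as external is a simplifying assumption (links to your own domain would be marked too).

```php
<?php
# Hypothetical helper: marks external links in an already loaded document.
function markExternalLinks(DOMDocument $dom): int
{
    $changed = 0;
    foreach ($dom->getElementsByTagName('a') as $a) {
        # Treat only absolute http(s) URLs as external (a simplification).
        if (!preg_match('#^https?://#i', $a->getAttribute('href'))) {
            continue;
        }
        # Keep existing rel tokens and append nofollow only when missing.
        $rel = preg_split('/\s+/', $a->getAttribute('rel'), -1, PREG_SPLIT_NO_EMPTY);
        if (!in_array('nofollow', $rel, true)) {
            $rel[] = 'nofollow';
        }
        $a->setAttribute('rel', implode(' ', $rel));
        # Add target only when the author has not set one.
        if (!$a->hasAttribute('target')) {
            $a->setAttribute('target', '_blank');
        }
        $changed++;
    }
    return $changed;
}

$dom = new DOMDocument();
$dom->loadHTML('<p><a rel="author" href="https://example.com/">out</a> <a href="/about">in</a></p>');
echo markExternalLinks($dom), PHP_EOL;
```

The return value is the number of links that were treated as external, which is handy for logging.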
PHP DOM: Add alt and title to images
Now we can do the same action for all images inside our DOM:
$images = $dom->getElementsByTagName('img');
if($images instanceof DOMNodeList){
    foreach($images as $i){
        # $ex['title'] is not defined in this snippet; it is assumed to hold the post title
        # getAttribute() returns an empty string when the attribute is missing
        if( !strlen($i->getAttribute('alt'))){
            $i->setAttribute('alt', $ex['title']);
        }
        if( !strlen($i->getAttribute('title'))){
            $i->setAttribute('title', $ex['title']);
        }
    }
}
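The same image step can be sketched as a small function. The name fillImageText and its $fallback parameter are hypothetical; $fallback stands in for the post title that the snippet above reads from $ex['title'].

```php
<?php
# Hypothetical helper: $fallback stands in for the post title.
function fillImageText(DOMDocument $dom, string $fallback): void
{
    foreach ($dom->getElementsByTagName('img') as $img) {
        foreach (['alt', 'title'] as $name) {
            # getAttribute() returns '' for a missing attribute,
            # so one check covers both "absent" and "empty".
            if (trim($img->getAttribute($name)) === '') {
                $img->setAttribute($name, $fallback);
            }
        }
    }
}

$dom = new DOMDocument();
$dom->loadHTML('<p><img src="a.png" alt=""><img src="b.png" alt="Chart"></p>');
fillImageText($dom, 'My post');
foreach ($dom->getElementsByTagName('img') as $img) {
    echo $img->getAttribute('alt'), ' | ', $img->getAttribute('title'), PHP_EOL;
}
```

An alt text that the author already wrote is kept, and only empty or missing values are filled in.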
After saveHTML, $text contains additional HTML tags that you did not send, for example <html> and <body>. To get rid of them, I like to use this PHP code, which cuts our markup using PHP string functions:
$a = $dom->saveHTML($dom);
$a = substr($a,strpos($a, '<body>')+6,strlen($a));
$a = substr($a,0,strpos($a, '</body>'));
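As an alternative to cutting strings, the body content can be collected node by node. This is a sketch under the assumption that loadHTML has wrapped the fragment in a <body> element; the name bodyInnerHtml is hypothetical.

```php
<?php
# Hypothetical helper: returns the markup inside <body> without string cutting.
function bodyInnerHtml(DOMDocument $dom): string
{
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $html = '';
    foreach ($body->childNodes as $child) {
        # saveHTML() with a node argument serializes just that node.
        $html .= $dom->saveHTML($child);
    }
    return $html;
}

$dom = new DOMDocument();
$dom->loadHTML('<p>Hi</p><img src="a.png">');
echo bodyInnerHtml($dom), PHP_EOL;
```

Unlike a search for the literal string '<body>', this variant does not depend on how the opening tag is written, so it also works when the body element carries attributes.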
Now our content is more SEO-friendly and we can save it to the database. Don't forget to escape this HTML before inserting it into your table: use prepared statements (PDO or mysqli) or mysqli_real_escape_string, because mysql_real_escape_string was removed in PHP 7.
PHP DOM is simple to use, yet it gives us real power to shape our content as we need, and this post is a simple working example of that.