JSON caching RSS feed for website

9:29 PM 1/8/2019

The text in bold above is just one sample from the many daily quotations at Quotationspage.com (many thanks for allowing me to use it here). No matter how many times you refresh or reload this page, it shows different text only when the cache is outdated (by a day or more).

Sometimes we need to pull an RSS feed (daily news, a daily quote, a 5-day weather forecast, etc.) from an outside provider to enhance the content of our website. As the number of visitors grows over time, the number of requests to the RSS provider grows too, and it may exceed the traffic limit the provider imposes.

To avoid this situation, we should use a proper caching mechanism. In practice, this means dynamically creating a temporary JSON (cache) file on our server. This cache file should store only the information we need, plus the published date (or time) of that information.

Clearly, this requires the RSS feed to have a date / time field. It doesn't have one? Then it is a bad feed and you should just look elsewhere for a better one :)
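The stored date is what makes the staleness check possible. Here is a minimal sketch of that check in JavaScript, assuming a cache record with a `date` field in `YYYY-MM-DD` form. The helper names `toDay` and `isFresh`, the sample values, and the use of UTC are assumptions made for illustration only.

```javascript
// Hypothetical cache record: only the needed fields plus the published date.
var cache = {
	date: '2019-01-08',
	auth: 'Author name',
	txt: 'A quote that says...',
	url: 'https://example.com/quote'
};

// Format a Date as YYYY-MM-DD (UTC is an assumption of this sketch).
function toDay(d){
	return d.toISOString().slice(0, 10);
}

// The cache is fresh only when it exists and carries the current day's date.
function isFresh(cache, now){
	return cache !== null && cache.date === toDay(now);
}
```

A feed without a date field would leave `cache.date` undefined, so `isFresh` could never return true and every visit would hit the provider again.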

See the following diagram:

[Figure: cache flow]

This diagram depicts the caching mechanism implemented in xhr.php, as follows:

PHP

<?php
const LOCAL_URL = 'cache.json'; //path of the local cache file

const REMOTE_HOST = 'rss-feed-provider.com';
const REMOTE_PATH = '/rss-feed-provider-path';
const REMOTE_PORT = 80; //for plain HTTP use 80; for HTTPS use 443
const REMOTE_PROTOCOL = ''; //for HTTP use empty ''; for HTTPS use 'ssl://'
const REMOTE_MAX_CHARS_PER_LINE = 500; //chunk size for fgets(); longer lines are simply read in several pieces

const TIMEOUT = 30; //connection timeout in seconds; without it PHP falls back to default_socket_timeout (60 by default)

function getRemoteXML(){
	$conn = fsockopen(REMOTE_PROTOCOL . REMOTE_HOST, REMOTE_PORT, $errCode, $errStr, TIMEOUT);
	if(!$conn)
		return FALSE;

	//ask with HTTP/1.0 so the server can not answer with a chunked body, which loadXML() would fail to parse
	$str = 'GET ' . REMOTE_PATH . " HTTP/1.0\r\n";
	$str .= 'Host: ' . REMOTE_HOST . "\r\n";
	$str .= "Connection: Close\r\n\r\n";
	fwrite($conn, $str);

	$str = '';
	while(!feof($conn)){
		$line = fgets($conn, REMOTE_MAX_CHARS_PER_LINE);
		if($line === FALSE)
			break; //read error or timeout: stop instead of looping forever
		$str .= $line;
	}
	fclose($conn);

	//remove HTTP header lines
	$parts = explode("\r\n\r\n", $str, 2);
	if(count($parts) < 2 || $parts[1] === '')
		return FALSE;

	//remove feedburner's stylesheet lines
	$xmlStyled = new DOMDocument('1.0', 'utf-8');
	if(!@$xmlStyled->loadXML($parts[1]))
		return FALSE;
	$rss = $xmlStyled->getElementsByTagName('rss')->item(0);
	if(!$rss)
		return FALSE;

	$xml = new DOMDocument('1.0', 'utf-8');
	$xml->appendChild($xml->importNode($rss, true));

	return $xml;
}//end of getRemoteXML()

//when local file doesn't exist,
if(!file_exists(LOCAL_URL)){
	
	//read remote XML,
	$xml = getRemoteXML();
	if(!$xml){
		$ret['code'] = 0;
		$ret['msg'] = 'Cannot load remote RSS.';
	}
	//get 1st quote details as a response,
	else{
		$quote = $xml->getElementsByTagName('item')->item(0);
		$date = new DateTime($quote->getElementsByTagName('pubDate')->item(0)->textContent);
		
		$cache['date'] = $date->format('Y-m-d');
		$cache['auth'] = $quote->getElementsByTagName('title')->item(0)->textContent;
		$cache['txt'] = $quote->getElementsByTagName('description')->item(0)->textContent;
		$cache['url'] = $quote->getElementsByTagName('link')->item(0)->textContent;
		
		//save response to local,
		file_put_contents(LOCAL_URL, json_encode($cache));
		
		//return response.
		$ret['code'] = 1;
		$ret['msg'] = $cache;
	}
}
//when local file exists,
else{
	//get today's date,
	$dateToday = new DateTime('now');

	//read local JSON,
	$cache = json_decode(file_get_contents(LOCAL_URL), true);
	$dateLocal = new DateTime($cache['date']);
	
	//if today's date equals the local JSON date, use the local cache as the response.
	if($dateToday->format('Y-m-d') === $dateLocal->format('Y-m-d')){
		$ret['code'] = 2;
		$ret['msg'] = $cache;
	}
	//if today's date isn't the same as local JSON date then read remote XML,
	else{
		//read remote XML,
		$xml = getRemoteXML();
		if(!$xml){
			$ret['code'] = -1;
			$ret['msg'] = 'Cannot load remote RSS.';
		}
		//get 1st quote details as a response,
		else{
			$quote = $xml->getElementsByTagName('item')->item(0);
			$date = new DateTime($quote->getElementsByTagName('pubDate')->item(0)->textContent);
		
			$cache = [];
			$cache['date'] = $date->format('Y-m-d');
			$cache['auth'] = $quote->getElementsByTagName('title')->item(0)->textContent;
			$cache['txt'] = $quote->getElementsByTagName('description')->item(0)->textContent;
			$cache['url'] = $quote->getElementsByTagName('link')->item(0)->textContent;
			
			//save response to local,
			file_put_contents(LOCAL_URL, json_encode($cache));
			
			//return response.
			$ret['code'] = 3;
			$ret['msg'] = $cache;
		}
	}
}
//always return JSON
header('Content-Type: application/json; charset=utf-8');
echo json_encode($ret);
exit;
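
The branches of the script above each set a different response code. As a reading aid, the decision can be condensed into one small function, sketched here in JavaScript. The function name `responseCode` and its parameters are hypothetical, and it only models which code is returned, not the fetching or the file writing.

```javascript
// cacheDate: 'YYYY-MM-DD' string from cache.json, or null when the file does not exist.
// today:     'YYYY-MM-DD' string for the current day.
// fetchOk:   whether the remote feed could be loaded (only consulted when a fetch is needed).
function responseCode(cacheDate, today, fetchOk){
	if(cacheDate === null)
		return fetchOk ? 1 : 0;   // no cache yet: 1 = fetched and saved, 0 = fetch failed
	if(cacheDate === today)
		return 2;                 // cache is current: served without touching the provider
	return fetchOk ? 3 : -1;      // cache outdated: 3 = refreshed, -1 = fetch failed
}
```

Only code 2 avoids a request to the provider. Codes 0 and -1 carry an error message instead of a quote, which is why the client-side script checks for a code above 0 before rendering.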

HTML

In this example, the markup is a single line: <blockquote id="test"></blockquote>. When the page loads, JavaScript completes the structure into something like this:

<blockquote id="test">
	<p>"A quote that says..."</p>
	<a href="https://..." target="_blank">author name...</a>
</blockquote>

JavaScript

var xhr = new XMLHttpRequest();
xhr.addEventListener('load', function(){
	var json = JSON.parse(this.responseText);
	console.log(json.code); //response code set by xhr.php, logged for debugging
	//codes above 0 carry a quote; 0 and -1 carry an error message instead
	if(json.code > 0){
		var test = document.getElementById('test'),
			el = test.appendChild(document.createElement('p'));
		//quote text goes into <p>
		el.textContent = json.msg.txt;
		//author name goes into <a>, linked to the item's URL
		el = test.appendChild(document.createElement('a'));
		el.appendChild(document.createTextNode(json.msg.auth));
		el.href = json.msg.url;
		el.target = '_blank';
	}
});
xhr.open('GET', 'xhr.php');
xhr.send();
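
If xhr.php ever emits something that is not valid JSON (a PHP warning printed ahead of the payload, for instance), JSON.parse throws inside the load handler. One possible guard is a small defensive parser, sketched below. The helper name `parseResponse` is hypothetical and is not part of the original script.

```javascript
// Returns the quote object (txt, auth, url, date) or null when there is nothing to render.
function parseResponse(text){
	var json;
	try{
		json = JSON.parse(text);
	}catch(e){
		return null; // body was not valid JSON
	}
	if(json === null || typeof json !== 'object')
		return null;
	// codes 0 and -1 mean the feed could not be loaded; msg is then an error string
	if(!(json.code > 0) || json.msg === null || typeof json.msg !== 'object')
		return null;
	return json.msg;
}
```

Inside the load handler it could be used as `var quote = parseResponse(this.responseText);`, with the blockquote filled in only when `quote` is not null.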
