As part of the development for imageGet, I needed to extract a file extension from a supplied URL. Specifically, I needed to pull extensions that could possibly be images (although for future compatibility's sake, I did not want to explicitly list them in the search itself). Guessing a file type from its extension is workable, but since an extension is no guarantee of the actual file type, you do need to follow up with MIME-type checking afterwards.
Since I'm on my regex streak, I ended up doing it with regular expressions, because why not?
var strA = 'http://www.feedseed.com/image.jpg',
    strB = '.com/image.jpg?abcdef',
    strC = 'config.inc.php',
    strD = 'image.jpg#lolol',
    strE = 'feedseed.com/.htaccess',
    // A dot, then 3-4 letters, then an optional ?query or #hash, anchored to the end.
    // Note [a-zA-Z], not [a-zA-z]: the range A-z also covers [ \ ] ^ _ and the backtick.
    regex = /\.([a-zA-Z]{3,4})(?:[\?#].+)?$/;

console.log(strA.match(regex));
console.log(strB.match(regex));
console.log(strC.match(regex));
console.log(strD.match(regex));
console.log(strE.match(regex), 'No match because .htaccess is > 4 characters, not a valid image extension');
... or alternatively, in PHP:
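<?php
$url = 'http://domain.tld/image.jpg?queryString#HashAsWell';
// preg_match() returns 1 on a match; $matches[1] then holds the captured extension.
// Falling back to null avoids reading an undefined index when nothing matches.
$ext = preg_match('/\.([a-zA-Z]{3,4})(?:[\?#].+)?$/', $url, $matches) ? $matches[1] : null;
?>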
Play around with the fiddle here. The expression also accounts for query strings and hashes being present, and only retrieves the trailing letters after a dot, so for a URL like domain.com/image.jpg it wouldn't match .com.
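To make the behaviour easier to reuse, here is a minimal sketch that wraps the same pattern (with the letter class spelled [a-zA-Z]) in a function. The name getExtension and the lowercasing of the result are my own assumptions for illustration; neither is part of imageGet.

```javascript
// Hypothetical helper, not part of imageGet: returns the extension
// in lowercase, or null when the pattern finds nothing.
function getExtension(url) {
  // A dot, 3-4 ASCII letters, then an optional ?query or #hash up to the end.
  var match = /\.([a-zA-Z]{3,4})(?:[?#].+)?$/.exec(url);
  return match ? match[1].toLowerCase() : null;
}

console.log(getExtension('http://domain.tld/IMAGE.JPG?queryString#HashAsWell')); // jpg
console.log(getExtension('feedseed.com/.htaccess')); // null
```

Returning null instead of throwing keeps the "no usable extension" case explicit, which is handy if the caller goes on to a MIME-type check anyway.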
A non-regex solution would involve taking all content after the trailing period, and cutting off everything after the first non-alphabetical character. Given that regex is slow [1], a regex solution might end up being slower than a non-regex solution. I'd love to see competing functions tackle this one!
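As a starting point, the non-regex approach described above could be sketched roughly like this. The function name getExtensionNoRegex is made up for this example, and the sketch follows the description literally: it does not enforce the 3-4 letter limit, and it has not been benchmarked against the regex.

```javascript
// Hypothetical non-regex variant: take everything after the last period,
// then cut it off at the first non-alphabetical character.
function getExtensionNoRegex(url) {
  var dot = url.lastIndexOf('.');
  if (dot === -1) return null; // no period at all

  var end = dot + 1;
  while (end < url.length) {
    var code = url.charCodeAt(end);
    var isLetter = (code >= 65 && code <= 90) || (code >= 97 && code <= 122);
    if (!isLetter) break; // first non-alphabetical character ends the extension
    end++;
  }

  var ext = url.slice(dot + 1, end);
  return ext.length > 0 ? ext : null;
}

console.log(getExtensionNoRegex('.com/image.jpg?abcdef')); // jpg
console.log(getExtensionNoRegex('feedseed.com/.htaccess')); // htaccess
```

Because it keys off the last period only, this version behaves differently from the regex in a few cases: it returns htaccess for .htaccess, and a period inside the query string or hash (say, image.jpg?v=1.2) throws it off. Stripping everything from the first ? or # beforehand and checking the length would bring the two closer.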
====
[1] ... compared to simple string search. So I've been told; I haven't actually done any benchmarking to see whether that accusation is founded or not.