Url
in package
implements
Stringable
uses
BackwardCompatibility
Represents a URL string and allows performing various operations on the URL.
Most importantly, this class allows transparent handling of URLs that contain international characters (a.k.a. IRIs), so that they can easily be sanitized, normalized, validated, etc. This class also makes it easy to convert IRIs to raw ASCII URLs and back.
Table of Contents
Interfaces
- Stringable
Constants
- SCHEME_GRAVATAR = 'gravatar'
- SCHEME_HTTP = 'http'
- SCHEME_HTTPS = 'https'
Properties
- $basic_tlds : array<string|int, mixed>
- $cc_tlds : array<string|int, mixed>
- $fragment : string
- $host : string
- $pass : string
- $path : string
- $port : int
- $query : string
- $scheme : string|null
- $special_use_tlds : array<string|int, mixed>
- $user : string
- $is_ascii : bool
- $url : string|false
Methods
- __construct() : mixed
- Constructor.
- __toString() : string
- Returns the URL as a string.
- create() : self
- Convenience wrapper for constructor.
- exportStatic() : void
- Provides a way to export a class's public static properties and methods to the global namespace.
- hasSSL() : bool
- Checks if this URL has an SSL certificate.
- isGravatar() : bool
- Checks if this is a Gravatar URL.
- isScheme() : bool
- Checks if this URL uses one of the specified schemes.
- isValid() : bool
- Checks whether this is a valid IRI. Makes no changes.
- isWebsite() : bool
- Checks if this URL points to a website.
- normalize() : self
- Performs Unicode normalization on the URL.
- parse() : string|int|array<string|int, mixed>|null|bool
- A wrapper for `parse_url()` that can handle URLs with international characters (a.k.a. IRIs).
- proxied() : self
- Gets the appropriate URL to use for images (or whatever) when using SSL.
- redirectsToHttps() : bool
- Checks if this URL has a redirect to https:// by querying headers.
- sanitize() : self
- Removes illegal characters from the URL.
- setTldRegex() : void
- Creates an optimized regex to match all known top level domains.
- toAscii() : self
- Converts an IRI (a URL with international characters) into an ASCII URL.
- toUtf8() : self
- Decodes a URL containing encoded international characters to UTF-8.
- validate() : self
- Checks whether this is a valid IRI, and sets $this->url to '' if not.
- checkIfAscii() : void
- Checks whether $this->url contains only ASCII characters.
Constants
SCHEME_GRAVATAR
public
mixed
SCHEME_GRAVATAR
= 'gravatar'
SCHEME_HTTP
public
mixed
SCHEME_HTTP
= 'http'
SCHEME_HTTPS
public
mixed
SCHEME_HTTPS
= 'https'
Properties
$basic_tlds
public
static array<string|int, mixed>
$basic_tlds
= ['com', 'net', 'org', 'edu', 'gov', 'mil', 'aero', 'asia', 'biz', 'cat', 'coop', 'info', 'int', 'jobs', 'mobi', 'museum', 'name', 'post', 'pro', 'tel', 'travel', 'xxx']
The 2012 list of top level domains, excluding ccTLDs.
$cc_tlds
public
static array<string|int, mixed>
$cc_tlds
= ['ac', 'ad', 'ae', 'af', 'ag', 'ai', 'al', 'am', 'ao', 'aq', 'ar', 'as', 'at', 'au', 'aw', 'ax', 'az', 'ba', 'bb', 'bd', 'be', 'bf', 'bg', 'bh', 'bi', 'bj', 'bm', 'bn', 'bo', 'br', 'bs', 'bt', 'bv', 'bw', 'by', 'bz', 'ca', 'cc', 'cd', 'cf', 'cg', 'ch', 'ci', 'ck', 'cl', 'cm', 'cn', 'co', 'cr', 'cu', 'cv', 'cx', 'cy', 'cz', 'de', 'dj', 'dk', 'dm', 'do', 'dz', 'ec', 'ee', 'eg', 'er', 'es', 'et', 'eu', 'fi', 'fj', 'fk', 'fm', 'fo', 'fr', 'ga', 'gb', 'gd', 'ge', 'gf', 'gg', 'gh', 'gi', 'gl', 'gm', 'gn', 'gp', 'gq', 'gr', 'gs', 'gt', 'gu', 'gw', 'gy', 'hk', 'hm', 'hn', 'hr', 'ht', 'hu', 'id', 'ie', 'il', 'im', 'in', 'io', 'iq', 'ir', 'is', 'it', 'je', 'jm', 'jo', 'jp', 'ke', 'kg', 'kh', 'ki', 'km', 'kn', 'kp', 'kr', 'kw', 'ky', 'kz', 'la', 'lb', 'lc', 'li', 'lk', 'lr', 'ls', 'lt', 'lu', 'lv', 'ly', 'ma', 'mc', 'md', 'me', 'mg', 'mh', 'mk', 'ml', 'mm', 'mn', 'mo', 'mp', 'mq', 'mr', 'ms', 'mt', 'mu', 'mv', 'mw', 'mx', 'my', 'mz', 'na', 'nc', 'ne', 'nf', 'ng', 'ni', 'nl', 'no', 'np', 'nr', 'nu', 'nz', 'om', 'pa', 'pe', 'pf', 'pg', 'ph', 'pk', 'pl', 'pm', 'pn', 'pr', 'ps', 'pt', 'pw', 'py', 'qa', 're', 'ro', 'rs', 'ru', 'rw', 'sa', 'sb', 'sc', 'sd', 'se', 'sg', 'sh', 'si', 'sj', 'sk', 'sl', 'sm', 'sn', 'so', 'sr', 'ss', 'st', 'su', 'sv', 'sx', 'sy', 'sz', 'tc', 'td', 'tf', 'tg', 'th', 'tj', 'tk', 'tl', 'tm', 'tn', 'to', 'tr', 'tt', 'tv', 'tw', 'tz', 'ua', 'ug', 'uk', 'us', 'uy', 'uz', 'va', 'vc', 've', 'vg', 'vi', 'vn', 'vu', 'wf', 'ws', 'ye', 'yt', 'za', 'zm', 'zw']
Country code top level domains.
$fragment
public
string
$fragment
The fragment component of the URL.
$host
public
string
$host
The host component of the URL.
$pass
public
string
$pass
The password component of the URL.
$path
public
string
$path
The path component of the URL.
$port
public
int
$port
The port component of the URL.
$query
public
string
$query
The query component of the URL.
$scheme
public
string|null
$scheme
= null
The scheme component of the URL.
$special_use_tlds
public
static array<string|int, mixed>
$special_use_tlds
= ['local', 'onion', 'test']
"Special use domain names" that aren't in DNS but may possibly resolve.
See https://www.iana.org/assignments/special-use-domain-names.
$user
public
string
$user
The user component of the URL.
$is_ascii
protected
bool
$is_ascii
Whether this contains only ASCII characters.
If not set, unknown.
$url
protected
string|false
$url
The URL string, or false if invalid.
Methods
__construct()
Constructor.
public
__construct(string $url[, bool $normalize = false ]) : mixed
Parameters
- $url : string
-
The URL or IRI.
- $normalize : bool = false
-
Whether to normalize the URL during construction. Default: false.
__toString()
Returns the URL as a string.
public
__toString() : string
Return values
string
create()
Convenience wrapper for constructor.
public
static create(string $url[, bool $normalize = false ]) : self
This is just syntactical sugar to ease method chaining.
Parameters
- $url : string
-
The URL or IRI.
- $normalize : bool = false
-
Whether to normalize the URL during construction. Default: false.
Return values
self —The created object.
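The static-factory pattern described above can be sketched as follows. `TinyUrl` and its `stripWhitespace()` method are hypothetical stand-ins used only for illustration; they are not part of this class.

```php
<?php

// Hypothetical stand-in class; it only illustrates the static-factory
// pattern and is not the real Url implementation.
final class TinyUrl
{
    private string $url;

    public function __construct(string $url)
    {
        $this->url = $url;
    }

    // Static factory: equivalent to `new TinyUrl($url)`, but usable
    // directly at the start of a method chain.
    public static function create(string $url): self
    {
        return new self($url);
    }

    // Example fluent method: modifies the object and returns it.
    public function stripWhitespace(): self
    {
        $this->url = trim($this->url);

        return $this;
    }

    public function __toString(): string
    {
        return $this->url;
    }
}

// The factory call and the follow-up method call form one expression.
$chained = (string) TinyUrl::create('  https://example.com/  ')->stripWhitespace();
```

Returning `$this` from each modifying method is what lets further calls be appended to the chain.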
exportStatic()
Provides a way to export a class's public static properties and methods to the global namespace.
public
static exportStatic() : void
To do so:
- Use this trait in the class.
- At the END of the class's file, call its exportStatic() method.
Although it might not seem that way at first glance, this approach conforms to section 2.3 of PSR 1, since executing this method is simply a dynamic means of declaring functions when the file is included; it has no other side effects.
Regarding the $backcompat items:
A class's static properties are not exported to global variables unless explicitly included in $backcompat['prop_names'].
$backcompat['prop_names'] is a simple array where the keys are the names of one or more of a class's static properties, and the values are the names of global variables. In each case, the global variable will be set to a reference to the static property. Static properties that are not named in this array will not be exported.
Adding non-static properties to the $backcompat arrays will produce runtime errors. It is the responsibility of the developer to make sure not to do this.
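As a rough illustration of the mechanism described above, the sketch below binds a global variable by reference to a static property. The class name, the property, and the `exportProps()` method are hypothetical; the real trait may work differently in detail.

```php
<?php

// Hypothetical class and names, used only to illustrate the mechanism.
class ExampleSettings
{
    public static array $options = ['theme' => 'default'];

    // Keys are static property names; values are global variable names.
    private static array $prop_names = ['options' => 'example_options'];

    public static function exportProps(): void
    {
        foreach (self::$prop_names as $static => $global) {
            // Bind the global to the static property by reference, so
            // both names refer to the same underlying value.
            $GLOBALS[$global] = &self::${$static};
        }
    }
}

ExampleSettings::exportProps();

// A write through the global name is intended to reach the static property.
$GLOBALS['example_options']['theme'] = 'dark';
```

Because the binding is a reference rather than a copy, legacy code that uses the global variable and newer code that uses the static property should stay in sync.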
hasSSL()
Checks if this URL has an SSL certificate.
public
hasSSL() : bool
Return values
bool —Whether the URL has an SSL certificate.
isGravatar()
Checks if this is a Gravatar URL.
public
isGravatar() : bool
Return values
bool —Whether this is a Gravatar URL.
isScheme()
Checks if this URL uses one of the specified schemes.
public
isScheme(string|array<string|int, string> $scheme) : bool
Parameters
- $scheme : string|array<string|int, string>
-
Schemes to check.
Return values
bool —Whether the URL matches a scheme.
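A scheme check of this kind can be sketched with PHP's built-in `parse_url()`. The function name `url_uses_scheme()` and the case-insensitive comparison are assumptions made for this example, not a description of the method's internals.

```php
<?php

/**
 * Sketch: checks whether a URL's scheme is one of the given schemes.
 * The function name and the case-insensitive comparison are assumptions.
 *
 * @param string $url The URL to inspect.
 * @param string|string[] $schemes One scheme or a list of schemes.
 */
function url_uses_scheme(string $url, string|array $schemes): bool
{
    $scheme = parse_url($url, PHP_URL_SCHEME);

    // parse_url() returns null when there is no scheme and false on
    // seriously malformed input.
    if (!is_string($scheme)) {
        return false;
    }

    // Accept a single scheme or a list, and compare case-insensitively.
    $schemes = array_map('strtolower', (array) $schemes);

    return in_array(strtolower($scheme), $schemes, true);
}

$is_web = url_uses_scheme('https://example.com/', ['http', 'https']);
```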
isValid()
Checks whether this is a valid IRI. Makes no changes.
public
isValid([int $flags = 0 ]) : bool
Similar to filter_var($url, FILTER_VALIDATE_URL, $flags), except that it correctly handles URLs with international characters (a.k.a. IRIs), it recognizes schemeless URLs like '//www.example.com', and it only returns a boolean rather than a mixed value.
Parameters
- $flags : int = 0
-
Optional flags for filter_var's third parameter.
Return values
bool —Whether this is a valid IRI.
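The following is a minimal sketch of how validation with `filter_var()` and `FILTER_VALIDATE_URL` can be made tolerant of non-ASCII characters and schemeless URLs. It is not this method's implementation. It assumes that the host is already ASCII (an internationalized host would first need Punycode conversion) and that `http:` is an acceptable placeholder scheme; the function name is hypothetical.

```php
<?php

/**
 * Sketch of IRI-tolerant validation built on filter_var().
 * Assumptions: the host is already ASCII, and "http:" is a suitable
 * placeholder scheme for schemeless URLs.
 */
function is_valid_iri_sketch(string $iri, int $flags = 0): bool
{
    // Give schemeless URLs such as '//www.example.com' a temporary scheme.
    if (strncmp($iri, '//', 2) === 0) {
        $iri = 'http:' . $iri;
    }

    // Percent-encode runs of non-ASCII bytes so filter_var() accepts them.
    $ascii = preg_replace_callback(
        '/[\x80-\xFF]+/',
        static fn(array $m): string => rawurlencode($m[0]),
        $iri
    );

    // Collapse filter_var()'s mixed return value into a boolean.
    return filter_var($ascii, FILTER_VALIDATE_URL, $flags) !== false;
}

$plain_check  = filter_var('https://www.example.com/café', FILTER_VALIDATE_URL);
$sketch_check = is_valid_iri_sketch('https://www.example.com/café');
```

Here the plain `filter_var()` call rejects the URL because of the non-ASCII character in the path, while the sketch accepts it.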
isWebsite()
Checks if this URL points to a website.
public
isWebsite() : bool
Return values
bool —Whether the URL matches the https or http schemes.
normalize()
Performs Unicode normalization on the URL.
public
normalize() : self
Internally calls $this->sanitize(), then performs Unicode normalization on the URL as a whole, using NFKC normalization for the domain name (see RFC 3491) and NFC normalization for the rest.
Return values
self —A reference to this object for method chaining.
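To illustrate the difference between the two normalization forms mentioned above, here is a short sketch using the `Normalizer` class from PHP's intl extension. It assumes that intl is loaded; the sample strings are illustrative only.

```php
<?php

// Requires PHP's intl extension, which provides the Normalizer class.
// The sample strings are illustrative.

// NFKC folds compatibility characters: the ligature U+FB01 ("fi" as a
// single glyph) becomes the two ASCII letters "fi", which suits host names.
$host = Normalizer::normalize("\u{FB01}le.example", Normalizer::FORM_KC);

// NFC only composes canonical sequences: "e" followed by a combining
// acute accent (U+0301) becomes the single character U+00E9.
$path = Normalizer::normalize("/cafe\u{0301}", Normalizer::FORM_C);

// NFC leaves compatibility characters such as the ligature untouched.
$ligature = Normalizer::normalize("\u{FB01}", Normalizer::FORM_C);
```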
parse()
A wrapper for `parse_url()` that can handle URLs with international characters (a.k.a. IRIs).
public
parse([int $component = -1 ]) : string|int|array<string|int, mixed>|null|bool
Parameters
- $component : int = -1
-
Optional flag for parse_url's second parameter.
Return values
string|int|array<string|int, mixed>|null|bool —Same as parse_url(), but with unmangled Unicode.
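One common way to keep `parse_url()` from mangling multibyte characters is to percent-encode the non-ASCII bytes first and decode the components afterwards. The sketch below shows that idea under stated assumptions; the function name is hypothetical, and it is not necessarily how this method works internally.

```php
<?php

/**
 * Sketch of a parse_url() wrapper for IRIs. The function name is
 * hypothetical. Caveat: decoding the components afterwards also decodes
 * any percent-escapes that were already present in the input.
 *
 * @return array<string, int|string>|false
 */
function parse_iri_sketch(string $iri): array|false
{
    // Percent-encode non-ASCII bytes so parse_url() sees plain ASCII.
    $ascii = preg_replace_callback(
        '/[\x80-\xFF]+/',
        static fn(array $m): string => rawurlencode($m[0]),
        $iri
    );

    $parts = parse_url((string) $ascii);

    if (!is_array($parts)) {
        return false;
    }

    // Restore the original characters in every string component.
    foreach ($parts as $key => $value) {
        if (is_string($value)) {
            $parts[$key] = rawurldecode($value);
        }
    }

    return $parts;
}

$parts = parse_iri_sketch('https://bücher.example:8080/straße?q=münchen');
```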
proxied()
Gets the appropriate URL to use for images (or whatever) when using SSL.
public
proxied() : self
The returned URL may or may not be a proxied URL, depending on the situation.
Mods can implement alternative proxies using the 'integrate_proxy' hook.
Return values
self —A new instance of this class for the proxied URL.
redirectsToHttps()
Checks if this URL has a redirect to https:// by querying headers.
public
redirectsToHttps() : bool
Return values
bool —Whether a redirect to HTTPS was found.
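A header-based check of this kind might look like the sketch below, which scans header lines in the format returned by PHP's `get_headers()`. The function name and the exact matching rule are assumptions, and the header lines in the example are made-up sample input.

```php
<?php

/**
 * Sketch: scans response header lines, in the format returned by
 * get_headers(), for a redirect to an https:// address. The function
 * name and this exact check are assumptions.
 *
 * @param string[] $headers Raw header lines.
 */
function headers_redirect_to_https(array $headers): bool
{
    foreach ($headers as $line) {
        // Header names are case-insensitive.
        if (preg_match('~^Location:\s*https://~i', $line) === 1) {
            return true;
        }
    }

    return false;
}

// Made-up sample input; real input would come from get_headers($url).
$sample = [
    'HTTP/1.1 301 Moved Permanently',
    'Location: https://example.com/',
];

$redirects = headers_redirect_to_https($sample);
```

In real use, the return value of `get_headers()` should be checked first, since that function returns false when the request fails.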
sanitize()
Removes illegal characters from the URL.
public
sanitize() : self
Unlike filter_var($url, FILTER_SANITIZE_URL), this correctly handles URLs with international characters (a.k.a. IRIs).
Return values
self —A reference to this object for method chaining.
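The contrast with `FILTER_SANITIZE_URL` can be illustrated with a small sketch that applies the filter only to the ASCII parts of the string, leaving multibyte UTF-8 sequences untouched. This approach and the function name are assumptions for illustration; the method itself may use different rules.

```php
<?php

/**
 * Sketch: applies FILTER_SANITIZE_URL to the ASCII parts of a URL only,
 * leaving multibyte UTF-8 sequences untouched. The function name and
 * this approach are assumptions.
 */
function sanitize_iri_sketch(string $iri): string
{
    return (string) preg_replace_callback(
        '/[\x00-\x7F]+/',
        static fn(array $m): string => (string) filter_var($m[0], FILTER_SANITIZE_URL),
        $iri
    );
}

$dirty = 'https://example.com/ca fé';

// The built-in filter strips the space and also the non-ASCII "é".
$plain = filter_var($dirty, FILTER_SANITIZE_URL);

// The sketch strips only the space.
$kept = sanitize_iri_sketch($dirty);
```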
setTldRegex()
Creates an optimized regex to match all known top level domains.
public
static setTldRegex([bool $update = false ]) : void
The optimized regex is stored in Config::$modSettings['tld_regex'].
To update the stored version of the regex to use the latest list of valid TLDs from iana.org, set the $update parameter to true. Updating can take some time, based on network connectivity, so it should normally only be done by calling this function from a background or scheduled task.
If $update is not true, but the regex is missing or invalid, the regex will be regenerated from a hard-coded list of TLDs. This regenerated regex will be overwritten on the next scheduled update.
Parameters
- $update : bool = false
-
If true, fetch and process the latest official list of TLDs from iana.org.
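The text above does not say how the regex is optimized. One common technique, shown here purely as an assumption, is to merge shared prefixes in a trie so that the alternation is shorter and needs less backtracking. Both function names are hypothetical, and the sketch assumes non-empty ASCII words.

```php
<?php

/**
 * Sketch: builds a compact regex alternation from a word list by merging
 * shared prefixes in a trie. Function names are hypothetical.
 *
 * @param string[] $words Non-empty ASCII words.
 */
function build_trie_regex(array $words): string
{
    $trie = [];

    foreach ($words as $word) {
        $node = &$trie;

        foreach (str_split($word) as $char) {
            if (!isset($node[$char])) {
                $node[$char] = [];
            }
            $node = &$node[$char];
        }

        // An empty-string key marks the end of a complete word.
        $node[''] = true;
        unset($node);
    }

    return trie_to_regex($trie);
}

function trie_to_regex(array $node): string
{
    $is_end = isset($node['']);
    unset($node['']);

    $branches = [];

    foreach ($node as $char => $child) {
        $branches[] = preg_quote((string) $char, '~') . trie_to_regex($child);
    }

    if ($branches === []) {
        return '';
    }

    // Group when there are alternatives or when the group must be optional.
    $regex = (count($branches) > 1 || $is_end)
        ? '(?:' . implode('|', $branches) . ')'
        : $branches[0];

    // If a word ends at this node, everything after it is optional.
    return $is_end ? $regex . '?' : $regex;
}

$tld_regex = build_trie_regex(['com', 'coop', 'cat', 'net']);
$matches_coop = preg_match('~^' . $tld_regex . '$~', 'coop') === 1;
```

With a list the size of the full TLD registry, prefix merging of this kind keeps the pattern much smaller than a flat `a|b|c` alternation.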
toAscii()
Converts an IRI (a URL with international characters) into an ASCII URL.
public
toAscii() : self
Uses Punycode to encode any non-ASCII characters in the domain name, and uses standard URL encoding on the rest.
Return values
self —A reference to this object for method chaining.
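The two encodings described above can be sketched with `idn_to_ascii()` (from the intl extension) for the host and `rawurlencode()` for the path. The helper name, its two-argument shape, the fixed `https://` scheme, and the sample IRI are all assumptions made for this example.

```php
<?php

// Requires the intl extension for idn_to_ascii(). The helper name, the
// fixed "https://" scheme, and the sample IRI are illustrative.
function iri_to_ascii_sketch(string $host, string $path): string
{
    // Punycode-encode the host name.
    $ascii_host = idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

    if ($ascii_host === false) {
        throw new InvalidArgumentException('Host could not be converted.');
    }

    // Percent-encode each path segment separately so that the "/"
    // separators survive.
    $ascii_path = implode('/', array_map('rawurlencode', explode('/', $path)));

    return 'https://' . $ascii_host . $ascii_path;
}

$ascii = iri_to_ascii_sketch('bücher.example', '/straße/größe');
```

The host and the rest of the URL are handled separately because DNS names cannot carry percent-escapes, whereas paths and queries can.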
toUtf8()
Decodes a URL containing encoded international characters to UTF-8.
public
toUtf8() : self
Decodes any Punycode encoded characters in the domain name, then uses standard URL decoding on the rest.
Return values
self —A reference to this object for method chaining.
validate()
Checks whether this is a valid IRI, and sets $this->url to '' if not.
public
validate([int $flags = 0 ]) : self
Parameters
- $flags : int = 0
-
Optional flags for filter_var's third parameter.
Return values
self —A reference to this object for method chaining.
checkIfAscii()
Checks whether $this->url contains only ASCII characters.
protected
checkIfAscii() : void
Sets the value of $this->is_ascii to the result.
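An ASCII-only check can be expressed very compactly. The sketch below is one possible approach rather than the method's actual code, and the function name is hypothetical.

```php
<?php

// Sketch: a string is pure ASCII when it contains no byte above 0x7F.
// The function name is hypothetical.
function contains_only_ascii(string $url): bool
{
    return preg_match('/[\x80-\xFF]/', $url) === 0;
}

$ascii_only = contains_only_ascii('https://example.com/');
$has_unicode = !contains_only_ascii('https://example.com/café');
```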