Wikipedia:Lua/Modul/URLutil/en
URLutil – Module with functions for strings in the context of Internet addressing (URL; IP address, both IPv4 and IPv6; e-mail). Internationalized addresses (IRI) are also supported.
Because the module is meant to be useful for a wiki project, only persistent, openly accessible resources on the World Wide Web are supported. A few special cases are not implemented, but they are hardly relevant in practice:
- IPv4 address not in common notation (dotted decimal)
- URL with IPv6 host (in brackets; slightly opposing wikisyntax)
- Authority with username
Functions for templates
Most functions expect exactly one unnamed parameter, which must be provided to get a meaningful answer. Leading and trailing whitespace is ignored.
The return value is an empty string (“nothing”) if the parameter value does not meet the expectations. If there is a result, or the query condition is true, at least one visible character is returned. The result does not begin or end with a space, and HTML entities are decoded.
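As a rough illustration of these conventions, the following Lua sketch models how a template-facing wrapper could behave. It is not the module's actual code: `forTemplate` and `looksLikeIPv4` are hypothetical names introduced here, the IPv4 check is deliberately simplistic, and the decoding of HTML entities is left out.

```lua
-- Hypothetical stand-in for a real check; NOT the URLutil implementation.
-- Accepts four dot-separated decimal numbers, each in the range 0..255.
local function looksLikeIPv4( s )
    local a, b, c, d = s:match( "^(%d+)%.(%d+)%.(%d+)%.(%d+)$" )
    if not a then
        return false
    end
    for _, part in ipairs( { a, b, c, d } ) do
        if tonumber( part ) > 255 then
            return false
        end
    end
    return true
end

-- Hypothetical wrapper modelling the template conventions described above:
-- whitespace around the parameter is ignored, failure yields an empty
-- string, success yields at least one visible character ("1").
local function forTemplate( check, param )
    local value = ( param or "" ):match( "^%s*(.-)%s*$" )
    if value ~= ""  and  check( value ) then
        return "1"
    end
    return ""
end
```

Because failure is an empty string, a template can feed such a result directly into a conditional like `{{#if:}}`.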
- anchorencode
  - Encoding suitable for id="" HTML attributes.
  - Parameter 2 – (optional) allow leading digit
- decode
  - Decoding of a URL-encoded string, but [|] will be HTML-escaped
  - Parameter 2 – (optional) encoding
    - 2=QUERY – spaces as plus
    - 2=WIKI – sparse encoding, spaces as underscore
    - 2=PATH – spaces percent-encoded
- encode
  - Encoding similar to parser function {{urlencode:}}
  - Critical characters at the start will be encoded, as well as link brackets and pipe.
  - Parameter 2 – (optional) encoding
    - 2=QUERY – spaces as plus
    - 2=WIKI – sparse encoding, spaces as underscore
    - 2=PATH – spaces percent-encoded
- getAuthority
  - Extract server access from a resource URL (lowercase result)
  - nothing – if invalid
- getFragment
  - Extract fragment (if any) from a resource URL
  - Parameter 2 – (optional) decoding
    - 2=% – URL is %-coded
    - 2=WIKI – URL is Wiki-coded with dots and underscore
  - Result:
    - nothing – if not present
    - starting with # – if present
- getHost
  - Extract domain or IP address from a resource URL (lowercase result)
  - nothing – if invalid
- getLocation
  - Extract resource URL without a fragment, if any
- getPath
  - Extract path from a resource URL without any query or fragment.
  - Beginning with / as basic resource identification.
- getPort
  - Extract port number from a resource URL (numeric result)
  - nothing – if not present or invalid
- getQuery
  - Extract query from a resource URL
  - Parameter 2 – (optional) single parameter name
  - Parameter 3 – alternative separator like ; – default: &
  - Result:
    - nothing – if not present
    - single value, if single parameter requested
- getRelativePath
  - Extract path and query including fragment (if any) from a resource URL but relative to host.
- getScheme
  - Extract scheme from a resource URL (lowercase result, including double slashes)
    - // – relative protocol
    - https:// – protocol
    - nothing – if beginning of URL is invalid
- getTLD
  - Extract top level domain from a resource URL (lowercase result)
  - nothing – if invalid, or IP
- getTop2domain
  - Extract first two top levels of domain from a resource URL (lowercase result)
  - nothing – if invalid, or IP
- getTop3domain
  - Extract three top levels of domain from a resource URL (lowercase result)
  - nothing – if invalid, or IP
- isAuthority
  - Is it a server address (also IP) of a resource, including port?
  - 1 – yes
- isDomain
  - Is it a named domain, including subdomains?
  - 1 – yes
- isDomainExample
  - Is it an example domain defined in RFC 2606 (example.com example.edu example.net example.org)?
  - 1 – yes
- isDomainInt
  - Is it an Internationalized Domain Name (non-ASCII or Punycode)?
  - 1 – yes
- isHost
  - Is it a server address without port (also IP)?
  - 1 – yes
- isHostPathResource
  - Is it a resource URL, or a resource URL without the protocol part?
  - 1 – yes
- isIPlocal
  - Is it an IPv4 address presumed to be local (RFC 1918, RFC 1122), including addresses such as 0.0.0.0 (RFC 5735)?
  - 1 – yes
- isIPv4
  - Is it an IPv4 address in common notation (segmentation by dots, decimal)?
  - 1 – yes
- isIPv6
  - Is it an IPv6 address?
  - 1 – yes
- isMailAddress
  - Is it an e-mail address?
  - 1 – yes
- isMailLink
  - Is it an e-mail link (mailto:)?
  - 1 – yes
- isProtocolDialog
  - Is it a URL or scheme keyword that could be used to initiate a dialog in a wiki?
  - mailto, irc, ircs, ssh, telnet
  - 1 – yes
- isProtocolWiki
  - Is it a URL or scheme keyword that could point to a resource from within a wiki?
  - Relative protocol and ftp ftps git http https mms nntp sftp svn worldwind
  - Not desired here: gopher, wais, as well as mailto, irc, ircs, ssh, telnet.
  - 1 – yes
- isResourceURL
  - Is it a URL that provides general access to a resource? That means: relative protocol, http, https or ftp, together with a valid host. Other URLs might be used on project or functional pages, but not in an encyclopedic context.
  - 1 – yes
- isSuspiciousURL
  - Is it a URL that might be syntactically problematic and might trigger a warning?
  - 1 – yes
- isUnescapedURL
  - Is it a URL in which wikisyntax [ | ] needs to be escaped?
  - 1 – yes
- isWebURL
  - Is it a valid address for a resource (any protocol)?
  - 1 – yes
- wikiEscapeURL
  - Wikisyntax-safe escaping of [ | ] characters.
    - Identical with the parameter if no problematic character is present.
    - Otherwise [ | ] are replaced by webserver-safe HTML entities. A pipe is not possible in plain template syntax.
- failsafe
  - Version identification
The Failsafe interface is intended for version management of globally distributed Lua modules. It enables modules equipped with this interface to
- ensure that a library module required by a template or by another module, and available as a local copy, supports certain functionality, or complain if it does not.
- administer global updating and linking of module code via Wikidata.
The Failsafe interface is present both at template level and for direct Lua access.
The functions in detail are (not all of them are fully supported by every library yet):
Value | Result | current |
---|---|---|
nothing, false | local version ID | »2024-10-29« |
Minimal version | Version ID required at least, as a date in ISO format. The check is whether the current local implementation matches this version or is later. | |
wikidata | version ID of global upstream | »2024-10-29« |
item | ID of the Wikidata item | Q10859193 |
~ | Corresponding version ID locally and registered at Wikidata | »« |
@ | Is the current (module) page linked correctly with the Wikidata item? | |
The return value is empty in template programming, or false under Lua; otherwise it is a non-empty string as described.
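The minimal-version check described above can be pictured with a small Lua sketch. It assumes that version IDs are ISO dates, as in the table, so that plain string comparison orders them chronologically. `SERIAL` and `failsafeSketch` are illustrative names only, and the Wikidata-related values (wikidata, item, ~, @) are not modelled.

```lua
-- Assumed local version ID, taken from the "current" column above.
local SERIAL = "2024-10-29"

-- Illustrative sketch only; not the module's failsafe() implementation.
-- atleast: nil/false, or the minimal version required (ISO date string).
-- Returns the local version ID if the request is satisfied, else false.
local function failsafeSketch( atleast )
    if not atleast  or  atleast == "" then
        return SERIAL
    end
    -- ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    if atleast <= SERIAL then
        return SERIAL
    end
    return false
end
```

At template level, the `false` result would correspond to an empty string.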
Examples (test page)
A test page illustrates practical use.
Functions for Lua modules (API)
All functions described above can be used by other modules:
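local lucky, URLutil = pcall( require, "Module:URLutil" )
if type( URLutil ) == "table" then
    -- a table cannot be called directly; fetch the library
    -- via the module's access function of the same name
    URLutil = URLutil.URLutil()
else
    -- failure; URLutil is the error message
    return "<span class='error'>" .. URLutil .. "</span>"
end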
The following functions are then available:
- URLutil.anchorencode()
- URLutil.decode()
- URLutil.encode()
- URLutil.getAuthority()
- URLutil.getFragment()
- URLutil.getHost()
- URLutil.getLocation()
- URLutil.getPath()
- URLutil.getPort()
  - numerical value, or false
- URLutil.getQuery()
- URLutil.getQueryTable(url, separator)
  - table with all assignments key=value
- URLutil.getRelativePath()
- URLutil.getScheme()
- URLutil.getTLD()
- URLutil.getTop2domain()
- URLutil.getTop3domain()
- URLutil.isAuthority()
- URLutil.isDomain()
- URLutil.isDomainExample()
- URLutil.isDomainInt()
- URLutil.isHost()
- URLutil.isIP()
  - numerical 4, 6, or false
- URLutil.isIPlocal()
- URLutil.isIPv4()
- URLutil.isIPv6()
- URLutil.isMailAddress()
- URLutil.isMailLink()
- URLutil.isProtocolDialog()
- URLutil.isProtocolWiki()
- URLutil.isResourceURL()
- URLutil.isSuspiciousURL()
- URLutil.isUnescapedURL()
- URLutil.isWebURL()
- URLutil.wikiEscapeURL()
- URLutil.failsafe( atleast )
  - atleast (optional): nil, or minimal version request, or "wikidata"
Furthermore, there are three constants:
- URLutil.serial – string, current version ID (date)
- URLutil.suite – "URLutil"
- URLutil.item – number, Item on Wikidata
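A minimal sketch of how another module might combine a few of these functions. `summarise` is a hypothetical helper, not part of URLutil; the library table is passed in as an argument so that the sketch does not depend on how the module was loaded. It relies on the return types listed above (getPort: number or false; isIP: 4, 6 or false) and assumes that getHost gives a falsy value for an invalid URL.

```lua
-- Hypothetical helper, not part of URLutil.
-- lib: the loaded URLutil library table
-- url: string to inspect
-- Returns a short description string, or false if no host was found.
local function summarise( lib, url )
    local host = lib.getHost( url )
    if not host then
        return false        -- assumed: falsy result for an invalid URL
    end
    local parts = { host }
    local port = lib.getPort( url )      -- number, or false
    if port then
        table.insert( parts, "port " .. port )
    end
    local version = lib.isIP( host )     -- 4, 6, or false
    if version then
        table.insert( parts, "IPv" .. version )
    end
    return table.concat( parts, ", " )
end
```

With the library loaded as shown earlier, a call would look like `summarise( URLutil, url )`.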
Usage
General library; no limitations.
Dependencies
None.
See also
- mw: Uri library – other functionality for general URIs, and particularly helpful for wiki URLs.
Antetype
- en:Module:IPAddress – 2013-03-01
- Unit tests: en:Module:IPAddress/testcases