Wikipedia:Lua/Modul/URLutil/en

URLutil – module with functions for strings used in Internet addressing (URLs; IP addresses, both IPv4 and IPv6; and e-mail addresses). Internationalized addresses (IRI) are also supported.

Because the module is meant to benefit wiki projects, only persistent, openly accessible resources on the World Wide Web are supported. A few special cases are not implemented, but they are rarely relevant:

  • IPv4 address not in common notation (dotted decimal)
  • URL with IPv6 host (in square brackets, which conflicts somewhat with wikisyntax)
  • Authority with username

Functions for templates

Most functions expect exactly one unnamed parameter (which should be provided to get a meaningful answer). Leading and trailing whitespace is ignored.

The return value is an empty string (“nothing”) if the parameter value does not fulfil the expectations. If there is a result or the query condition is true, at least one visible character is returned. The result does not begin or end with a space, and HTML entities are decoded.
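
The conventions above can be pictured with a short Lua sketch. It only illustrates the whitespace and return-value rules and is not the module's actual code; `forTemplate` and `isHttps` are hypothetical names, and HTML entity decoding is left out.

```lua
-- Hypothetical wrapper illustrating the template-level conventions;
-- not taken from the module itself.
local function forTemplate(check, raw)
    -- ignore whitespace before and after the parameter value
    local value = (raw or ""):match("^%s*(.-)%s*$")
    local result = check(value)
    if result == true then
        return "1"                -- a true condition yields a visible character
    elseif result then
        return tostring(result)   -- any other result is passed on as text
    end
    return ""                     -- "nothing" if expectations are not met
end

-- Sample check (an assumption for this sketch): accept only https URLs
local function isHttps(s)
    return s:find("^https://") ~= nil
end

print(forTemplate(isHttps, "  https://example.org/ "))  -- prints 1
print(forTemplate(isHttps, "ftp://example.org/"))       -- prints an empty line
```

Because a failed check yields an empty string, a template can test the result directly with an `#if` construct.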

anchorencode
Encoding to be appropriate for id="" HTML attributes.
Parameter 2 – (optional) allow leading digit
decode
Decoding of a URL-encoded string; the characters [ | ] are HTML-escaped in the result
Parameter 2 – (optional) encoding
  • 2=QUERY – spaces as plus
  • 2=WIKI – sparse encoding, spaces as underscore
  • 2=PATH – spaces percent-encoded
encode
Encoding similar to parser function {{urlencode:}}
Critical characters at the start are encoded as well, as are link brackets and the pipe.
Parameter 2 – (optional) encoding
  • 2=QUERY – spaces as plus
  • 2=WIKI – sparse encoding, spaces as underscore
  • 2=PATH – spaces percent-encoded
getAuthority
Extract server access from a resource URL (lowercase result)
  • nothing – if invalid
getFragment
Extract fragment (if any) from a resource URL
Parameter 2 – (optional) decoding
  • 2=% – URL is %-coded
  • 2=WIKI – URL is Wiki-coded with dots and underscore
Result:
  • nothing – if not present
  • starting with # – if present
getHost
Extract domain or IP address from a resource URL (lowercase result)
  • nothing – if invalid
getLocation
Extract resource URL without a fragment, if any
getPath
Extract path from a resource URL without any query or fragment.
Beginning with / as basic resource identification.
getPort
Extract port number from a resource URL (numeric result)
  • nothing – if not present or invalid
getQuery
Extract query from a resource URL
Parameter 2 – (optional) single parameter name
Parameter 3 – (optional) alternative separator such as ; (default: &)
Result:
  • nothing – if not present
  • single value, if single parameter requested
getRelativePath
Extract path and query including fragment (if any) from a resource URL but relative to host.
getScheme
Extract scheme from a resource URL (lowercase result, including double slashes)
  • // – relative protocol
  • https:// – protocol
  • nothing – if beginning of URL is invalid
getTLD
Extract top level domain from a resource URL (lowercase result)
  • nothing – if invalid, or IP
getTop2domain
Extract the top two levels of the domain from a resource URL (lowercase result)
  • nothing – if invalid, or IP
getTop3domain
Extract the top three levels of the domain from a resource URL (lowercase result)
  • nothing – if invalid, or IP
isAuthority
Is it a server address (also IP) of a resource, including port?
  • 1 – yes
isDomain
Is it a named domain, including sub domains?
  • 1 – yes
isDomainExample
Is it an example domain defined in RFC 2606 (example.com example.edu example.net example.org)?
  • 1 – yes
isDomainInt
Is it an Internationalized Domain Name (non-ASCII or Punycode)?
  • 1 – yes
isHost
Is it a server address without port (also IP)?
  • 1 – yes
isHostPathResource
Is it a resource URL or a resource URL without protocol part?
  • 1 – yes
isIP
Is it an IP address?
  • 4 if IPv4 (in common dotted decimal notation)
  • 6 if IPv6
  • nothing – else
isIPlocal
Is it an IPv4 address presumed to be local (RFC 1918, RFC 1122)? This also covers addresses such as 0.0.0.0 (RFC 5735).
  • 1 – yes
isIPv4
Is it an IPv4 address in common notation (segmentation by dots, decimal)?
  • 1 – yes
isIPv6
Is it an IPv6 address?
  • 1 – yes
isMailAddress
Is it an e-mail address?
  • 1 – yes
isMailLink
Is it an e-mail link (mailto:)?
  • 1 – yes
isProtocolDialog
Is it a URL or scheme keyword that could be used to initiate a dialog in a wiki?
mailto, irc, ircs, ssh, telnet
  • 1 – yes
isProtocolWiki
Is it a URL or scheme keyword that could point to a resource from within a wiki?
Relative protocol and ftp ftps git http https mms nntp sftp svn worldwind
Not desired here are gopher and wais, as well as mailto, irc, ircs, ssh, telnet.
  • 1 – yes
isResourceURL
Is it a URL that provides general access to a resource? This requires a relative protocol, http, https, or ftp, together with a valid host. Other URLs might be used on project or functional pages, but not in an encyclopedic context.
  • 1 – yes
isSuspiciousURL
Is it a URL that might be syntactically problematic and might trigger a warning?
  • 1 – yes
isUnescapedURL
Is it a URL in which the wikisyntax characters [ | ] still need to be escaped?
  • 1 – yes
isWebURL
Is it a valid address for a resource (any protocol)?
  • 1 – yes
wikiEscapeURL
Wikisyntax-safe escaping of [ | ] characters.
  • Identical with the parameter, if no problematic character is present.
  • Otherwise, [ | ] are replaced by web-server-safe HTML entities. A pipe cannot be passed in plain template syntax.
failsafe
Version identification

The Failsafe interface is intended for version management of globally distributed Lua modules. It enables modules equipped with this interface to

  • ensure that a library module, required by a template or another module and available as a local copy, supports certain functionality, or complain if it does not.
  • manage global updating and linking of module code via Wikidata.

The Failsafe interface is available both at template level and for direct Lua access.

In detail, the functions are as follows (not every library supports all of them completely yet):

Parameter value, with result and current value:

nothing / false
Result: local version ID
Current: »2024-10-29«
Minimal version (date in ISO format)
Result: version ID required at least. It will be compared whether the current local implementation matches this version or later.
  • empty, if minimal version not achieved
Current:
  • 2001-01-01 → »2024-10-29«
  • 2099-01-01 → »«
wikidata
Result: version ID of global upstream
  • version ID at Wikidata
  • local, if not found there
Current: »2024-10-29«
item
Result: ID of the Wikidata item
  • empty if not defined
Current: Q10859193
~
Result: Corresponding version ID locally and registered at Wikidata
  • empty, if up to date
  • version ID at Wikidata, if not equal
Current: »«
@
Result: Is the current (module) page linked correctly with Wikidata item?
  • empty, if linked to the item which is supposed
  • item ID, if not

Where a result is described as empty, the return value is an empty string in template programming and false in Lua; otherwise it is a non-empty string as described.
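
Since version IDs are ISO dates, the minimal-version check can be pictured as a plain string comparison. The sketch below is an assumption about how such a check could work, not the module's code; `checkVersion` is a hypothetical name, and the serial is the version ID shown above.

```lua
local serial = "2024-10-29"   -- local version ID as shown above

-- Hypothetical helper: returns the local version ID if it satisfies
-- the request, otherwise false (the Lua-side "empty" result).
local function checkVersion(atleast)
    if atleast == nil or atleast == "" then
        return serial             -- no request: report the local version
    end
    -- ISO dates (YYYY-MM-DD) order chronologically when compared as strings
    if serial >= atleast then
        return serial
    end
    return false
end

print(checkVersion("2001-01-01"))  -- 2024-10-29
print(checkVersion("2099-01-01"))  -- false
```

The two calls mirror the examples listed for the minimal version above.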

Examples (test page)

A test page illustrates practical use.

Functions for Lua modules (API)

All functions described above can be used by other modules:

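-- load the library in protected mode, so that a missing module
-- does not abort the calling module
local lucky, URLutil = pcall( require, "Module:URLutil" )
if type( URLutil ) == "table" then
    -- success: obtain the table of library functions
    URLutil = URLutil()
else
    -- failure; URLutil is the error message
    return "<span class='error'>" .. URLutil .. "</span>"
end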

The following functions are then available:

  • URLutil.anchorencode()
  • URLutil.decode()
  • URLutil.encode()
  • URLutil.getAuthority()
  • URLutil.getFragment()
  • URLutil.getHost()
  • URLutil.getLocation()
  • URLutil.getPath()
  • URLutil.getPort()
    numerical value, or false
  • URLutil.getQuery()
  • URLutil.getQueryTable(url, separator)
    table with all assignments key=value
  • URLutil.getRelativePath()
  • URLutil.getScheme()
  • URLutil.getTLD()
  • URLutil.getTop2domain()
  • URLutil.getTop3domain()
  • URLutil.isAuthority()
  • URLutil.isDomain()
  • URLutil.isDomainExample()
  • URLutil.isDomainInt()
  • URLutil.isHost()
  • URLutil.isIP()
    numerical 4, 6, or false
  • URLutil.isIPlocal()
  • URLutil.isIPv4()
  • URLutil.isIPv6()
  • URLutil.isMailAddress()
  • URLutil.isMailLink()
  • URLutil.isProtocolDialog()
  • URLutil.isProtocolWiki()
  • URLutil.isResourceURL()
  • URLutil.isSuspiciousURL()
  • URLutil.isUnescapedURL()
  • URLutil.isWebURL()
  • URLutil.wikiEscapeURL()
  • URLutil.failsafe( atleast )
    1. atleast
      optional
      nil or minimal version request or "wikidata"
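
The key=value table mentioned for getQueryTable in the list above can be pictured with a stand-alone sketch. This is not the module's implementation; `queryTable` is a hypothetical name, no percent-decoding is performed, and returning false for a URL without a query is an assumption modelled on the other functions.

```lua
-- Hypothetical stand-alone helper illustrating a key=value query table;
-- not the module's implementation.
local function queryTable(url, separator)
    if not separator or separator == "" then
        separator = "&"                          -- default separator
    end
    local location = url:match("^([^#]*)")       -- drop the fragment, if any
    local query = location:match("%?(.*)$")      -- text after the first "?"
    if not query then
        return false                             -- no query present
    end
    local result = {}
    local start = 1
    while start <= #query do
        -- plain-text search, so a separator such as ";" needs no escaping
        local stop = query:find(separator, start, true) or (#query + 1)
        local pair = query:sub(start, stop - 1)
        local key, value = pair:match("^([^=]+)=(.*)$")
        if key then
            result[key] = value
        end
        start = stop + #separator
    end
    return result
end

local t = queryTable("https://example.org/w/index.php?title=Foo&action=edit#top")
print(t.title, t.action)   -- Foo     edit
```

Passing ";" as the second argument splits a query such as `a=1;b=2` in the same way.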

Furthermore, there are three constants:

  • URLutil.serial – string, current version ID (date)
  • URLutil.suite – "URLutil"
  • URLutil.item – number, Item on Wikidata

General library; no limitations.

Dependencies

None.

  • mw: Uri library – further functionality for general URIs, particularly helpful for wiki URLs.

en:Module:IPAddress – 2013-03-01