Wikipedia:Lua/Modul/Text/en
Text: a module containing methods for the manipulation of text, wikimarkup and some HTML.
Functions for templates
All methods have an unnamed parameter containing the text.
The return value is an empty string if the parameter does not meet the conditions. When the condition is matched or some result is successfully found, strings of at least one character are returned.
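The convention above (empty string for "no", at least one character for "yes") can be bridged from a Lua boolean with a small wrapper. The following is a minimal sketch, assuming a hypothetical helper `forTemplate` that is not part of the module; the choice of `"1"` as the non-empty marker is also an assumption.

```lua
-- Hypothetical helper, not part of Module:Text: turn a Lua result
-- into the string convention used by the template functions.
local function forTemplate( result )
    if result == nil or result == false then
        return ""               -- condition not met: empty string
    end
    if result == true then
        return "1"              -- assumed marker for a bare "yes"
    end
    return tostring( result )   -- pass any found result through
end

print( forTemplate( true ) )    --> 1
print( forTemplate( "abc" ) )   --> abc
```

A template can then test the result with `{{#if:}}`, since only the empty string counts as false there.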
- char
- Creates a string from a list of character codes.
- 1
- Space-separated list of character codes
- *
- Number of repetitions of the list in parameter 1 (Default 1).
- errors=0 – Silence errors
- concatParams
- Combine any number of elements into a list, like table.concat() in Lua.
- 1
- First element; missing and empty elements are ignored.
- 2 3 4 5 6 …
- Further list elements
- containsCJK
- Returns whether the input string contains any CJK characters
- Returns nothing if there are no CJK characters
- getPlain
- Remove wikimarkup (except templates): comments, tags, bold, italic, nbsp
- isLatinRange
- Returns some content, unless the string contains a character that would not normally be found in Latin text.
- Returns nothing if there is a non-Latin string.
- isQuote
- Returns some content if the parameter passed is a single character, and that character is a quote, such as '.
- Returns nothing for multiple characters, or if the character passed is not a quote.
- listToText
- Formats list elements analogously to mw.text.listToText().
- The elements are separated by a comma and space; the word “and” appears between the last two elements.
- Unnamed parameters become the list items.
- Optional parameters for #invoke:
- format – Every list element will first be formatted with this format string, built as for Lua's string.format(). The string must contain at least one %s sequence.
- template=1 – List elements should be taken from the calling template.
- Returns the resulting string.
- quote
- Wrap the string in quotes; quotes can be chosen for a specific language.
- 1
- Input text (will be automatically trimmed); may be empty.
- 2
- (optional) the ISO 639 language code for the quote marks; should be one of the supported languages
- 3
- (optional) 2 for second level quotes. This means the single quote marks in a statement such as: Jack said, “Jill said ‘fish’ last Tuesday.”
- quoteUnquoted
- Wrap the string in quotes; quotes can be chosen for a specific language. Will not quote an empty string, and will not quote if there is a quote at the start or end of the (trimmed) string.
- 1
- Input text (will be automatically trimmed); may be empty.
- 2
- (optional) the ISO 639 language code for the quote marks; should be one of the supported languages
- 3
- (optional) 2 for second level quotes. This means the single quote marks in a statement such as: Jack said, “Jill said ‘fish’ last Tuesday.”
- removeDiacritics
- Removes all diacritical marks from the input.
- 1
- Input text; may be empty.
- removeWhitespace
- Remove all whitespace, or replace with ASCII space
- 1
- Input text
- sentenceTerminated
- Is this sentence terminated? Should work with CJK, and allows quotation marks to follow.
- Returns nothing if the sentence is unterminated.
- tokenWords
- Split text into words consisting of digits or letters only, separated by single spaces; line breaks, punctuation and other non-alphanumeric characters are dropped.
- 1
- Input text; may be empty. Might contain HTML entities which will be resolved.
- ucfirstAll
- The first letter of every recognized word is converted to upper case. This contrasts with the parser function {{ucfirst:}} which changes only the first character of the whole string passed.
- A few common HTML entities are protected; the implementation of this may mean that numerical entities passed (e.g. &#38;) are converted to &amp; form.
- unstrip
- Remove all elements which probably have been evaluated by an extension.
- uprightNonlatin
- Takes a string. Italicized non-Latin characters are un-italicized, unless they are a single Greek letter.
- zip
- Combines a tuple of lists by convolution. This is easiest to explain by example: given two lists, list1 = "a b c" and list2 = "1 2 3", then zip(list1, list2, sep = " ", isep = "-", osep = "/") outputs a-1/b-2/c-3
- 1, 2, 3, … – Lists to be combined
- sep – A separator (in Lua regex form) used to split the lists. If empty, the lists are split into individual characters.
- sep1, sep2, sep3, … – Allows a different separator to be used for each list.
- isep – Output separator; placed between elements which were at the same index in their lists.
- osep – Output separator; placed between elements which had different original indices; i.e. between the groups joined with isep
- failsafe
- Version management
The Failsafe interface is intended for version management of globally distributed Lua modules. It enables modules equipped with this interface to
- ensure that a library module required by a template or by another module, and available as a local copy, supports certain functionality, or complain if it does not.
- administer global updating and linking of module code via Wikidata.
The Failsafe interface is present both at template level and for direct Lua access.
In detail, the functions are as follows (not every library supports all of them completely yet):
Value | Result | current |
---|---|---|
nothing or false | local version ID | »2024-11-25« |
minimal version | Minimum version ID required, as a date in ISO format. The check is whether the current local implementation matches this version or is later. | |
wikidata | version ID of global upstream | »2024-11-25« |
item | ID of the Wikidata item | Q29387871 |
~ | Corresponding version ID locally and registered at Wikidata | »« |
@ | Is the current (module) page linked correctly with the Wikidata item? | |

The return value is empty in template programming, or false under Lua; otherwise it is a non-empty string as described.
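The minimum-version check can be illustrated in plain Lua. This is a minimal sketch under stated assumptions, not the module's actual code: `serial` is a hypothetical stand-in for the local version ID, and the Wikidata modes (`wikidata`, `item`, `~`, `@`) are left out.

```lua
-- Hypothetical stand-in for a module's local version ID (ISO date).
local serial = "2024-11-25"

-- Sketch of the minimum-version check only; Wikidata modes omitted.
local function failsafe( atleast )
    if atleast == nil or atleast == false or atleast == "" then
        -- No requirement given: report the local version ID.
        return serial
    end
    -- ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    if type( atleast ) == "string" and serial >= atleast then
        return serial
    end
    return false
end

print( failsafe() )                 --> 2024-11-25
print( failsafe( "2020-01-01" ) )   --> 2024-11-25
print( failsafe( "2030-01-01" ) )   --> false
```

The plain string comparison works here only because the version IDs are ISO dates of equal length.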
Examples and test page
There are tests available to illustrate this in practice.
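As a further illustration, the behaviour described for zip can be reproduced in plain Lua. This is a sketch of the documented behaviour, not the module's implementation: the helper names `split` and `zip` are local to the example, a single `sep` is used for all lists, the empty separator splits by byte rather than by Unicode character, and padding shorter lists with empty strings is an assumption.

```lua
-- Split `text` at every match of the Lua pattern `sep`.
-- An empty `sep` splits into single bytes in this sketch.
local function split( text, sep )
    local parts = {}
    if sep == "" then
        for ch in text:gmatch( "." ) do
            parts[ #parts + 1 ] = ch
        end
        return parts
    end
    local start = 1
    while true do
        local i, j = text:find( sep, start )
        if not i or j < i then
            break    -- no further match, or an empty match
        end
        parts[ #parts + 1 ] = text:sub( start, i - 1 )
        start = j + 1
    end
    parts[ #parts + 1 ] = text:sub( start )
    return parts
end

-- Join elements of equal index with `isep`, and the groups with `osep`.
local function zip( lists, sep, isep, osep )
    local columns, longest = {}, 0
    for k, list in ipairs( lists ) do
        columns[ k ] = split( list, sep )
        longest = math.max( longest, #columns[ k ] )
    end
    local groups = {}
    for i = 1, longest do
        local group = {}
        for k = 1, #columns do
            -- Assumption: shorter lists are padded with empty strings.
            group[ k ] = columns[ k ][ i ] or ""
        end
        groups[ i ] = table.concat( group, isep )
    end
    return table.concat( groups, osep )
end

print( zip( { "a b c", "1 2 3" }, " ", "-", "/" ) )   --> a-1/b-2/c-3
```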
Functions for Lua modules (API)
All of the above functions can be called from other Lua modules, unless a Lua library function already does the same job. Load the module with require(); the code below checks for errors while loading it:
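local lucky, Text = pcall( require, "Module:Text" )
if type( Text ) == "table" then
    -- Assumption: the export table offers Text() as accessor
    -- to the library table; a plain table cannot be called.
    Text = Text.Text()
else
    -- In the event of errors, Text is an error message.
    return "<span class=\"error\">" .. Text .. "</span>"
end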
You may then call:
- Text.char( apply, again, accept )
- apply: table (sequence). Each element is either a (decimal) number, taken as a codepoint, or a string, which is inserted as is.
- again: optional number. Number of repetitions of the list in apply; defaults to: 1
- accept: optional true. Suppress error message.
- Text.concatParams( args, apply, adapt )
- args: table (sequence). Elements as string.
- apply: optional string. Separator between elements; defaults to: |
- adapt: optional string. Formatting which will be applied to each element; must contain: %s
- Text.containsCJK( a )
- returns: false if not, else true.
- Text.getPlain( a )
- Text.isLatinRange( a )
- returns: false if not, else true.
- Text.isQuote( a )
- returns: false if not, else true.
- Text.listToText( args, adapt )
- args: table (sequence). Elements as string.
- adapt: optional string. Formatting which will be applied to each element; must contain: %s
- Text.quote( apply, alien, advance )
- apply: string. Text.
- alien: optional string. Language code.
- advance: optional number. Level; defaults to: 2
- Text.quoteUnquoted( apply, alien, advance )
- apply: string. Text.
- alien: optional string. Language code.
- advance: optional number. Level; defaults to: 2
- Text.removeDiacritics( a )
- Text.removeWhitespace( a )
- Text.sentenceTerminated( a )
- returns: false if not, else true.
- Text.tokenWords( a )
- Text.ucfirstAll( a )
- Text.uprightNonlatin( a )
- Text.zip(…)
- Text.failsafe( atleast )
- atleast: optional. nil, or minimum version, or wikidata, or ~ for synchronisation
- returns: Version ID as string, or false
- Text.test( a )
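To make the parameter roles concrete, here is a plain-Lua sketch of the behaviour documented for Text.concatParams( args, apply, adapt ). It is an illustration under assumptions, not the module's code: it skips missing and empty elements as described for the template function, expects string elements, and does no trimming.

```lua
-- Sketch of the documented concatParams behaviour; not the real code.
local function concatParams( args, apply, adapt )
    -- Find the highest numeric key so that gaps do not end the loop.
    local highest = 0
    for key in pairs( args ) do
        if type( key ) == "number" and key > highest then
            highest = key
        end
    end
    local collected = {}
    for i = 1, highest do
        local item = args[ i ]
        -- Missing and empty elements are ignored.
        if item ~= nil and item ~= "" then
            if adapt then
                -- adapt must contain %s, which receives the element.
                item = string.format( adapt, item )
            end
            collected[ #collected + 1 ] = item
        end
    end
    -- The separator defaults to the pipe character.
    return table.concat( collected, apply or "|" )
end

print( concatParams( { "x", "y" } ) )                    --> x|y
print( concatParams( { "a", "", "b" }, ", ", "[%s]" ) )  --> [a], [b]
```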
Usage
This is a general library; use it anywhere.
Installation on other WMF projects
Follow these steps:
- Copy the main module Module:Text into your project.
- If possible, keep the name Text.
- If another name is required due to a conflict, a naming convention or a non-Latin script, then choose a different one.
- Register this module at d:Q29387871.
- To support quotation marks depending on language, you may copy the following sub module, and keep the chosen root name: /quoting
- Adaptation to local language and policy may be needed.
- One day more globalized support for language dependent quotation marks might become available.
- Ready.
- Consider translation of doc page.