This Wiki page tries to explain the philosophy behind choosing user names for UNIX systems. It has been created with the help of ChatGPT, but then reviewed by a DD for realism. It is a non-Debian augmentation of the UserAccounts page.
What is a Good User name?
Generally, the literature suggests the following properties for the choice of user names:
- Follow Standards: Stick to alphanumeric characters, underscores, and hyphens.
- Keep It Short and Descriptive: Aim for a balance between brevity and clarity.
- Avoid Reserved Words: Don’t use names that might conflict with commands or system processes.
- Avoid Special Characters: Some characters interfere with the system. Colons break /etc/passwd; slashes, periods and double periods break home directory paths; question marks and asterisks may trigger globbing where it is not wanted; control characters break terminals; and some punctuation characters might introduce security issues.
- Ensure Consistency: Use a consistent naming scheme across the organization.
- Think of Scalability: Plan for a growing user base without naming conflicts.
User name length
User names tend to become longer over time. While historic UNIX systems restricted the length of a user name to 8 or 16 characters, Linux has long rested on a 32 character limit that adduser still enforces, at least for system accounts. shadow has advanced to a limit of 255, although it is currently not clear whether this means bytes or characters, which can differ in multi-byte encodings such as UTF-8.
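The gap between bytes and characters can be illustrated with a short Python sketch. The helper name name_lengths is made up for this example, and nothing here reflects how shadow itself counts; it only shows that the two measures diverge as soon as a name leaves ASCII.

```python
def name_lengths(name: str) -> tuple[int, int]:
    """Return (code points, UTF-8 bytes) for a user name."""
    return len(name), len(name.encode("utf-8"))


# Sample names only; the second and third contain non-ASCII characters.
for candidate in ["mueller", "m\u00fcller", "\u738b\u5c0f\u660e"]:
    chars, octets = name_lengths(candidate)
    print(f"{candidate}: {chars} characters, {octets} bytes")
```

A limit expressed in bytes therefore allows noticeably fewer characters for names written in non-Latin scripts.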
When discussing user name length, a number of factors need to be taken into account. Historical assumptions built into a large number of software packages can interfere.
Tools like ls (when showing ownership with -l) or ps (when listing user processes) may truncate long usernames in their output. This can make it hard to distinguish users with similar prefixes in their names (e.g., verylongusername1 vs. verylongusername2 might both appear as verylong). Truncation often occurs at 8 or 16 characters, depending on the system and utility version.
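Whether an existing set of user names is affected by such truncation can be checked with a few lines of Python. This is only a sketch: the function name truncation_collisions and the sample names are illustrative, and the width has to be chosen to match whatever the tool in question actually does.

```python
from collections import defaultdict


def truncation_collisions(names, width=8):
    """Group user names that become identical when cut to `width` characters."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:width]].append(name)
    # Only prefixes shared by more than one name are ambiguous.
    return {prefix: users for prefix, users in groups.items() if len(users) > 1}


users = ["verylongusername1", "verylongusername2", "alice"]
print(truncation_collisions(users, width=8))
print(truncation_collisions(users, width=16))
```

With these sample names the two long accounts collide at both 8 and 16 characters, because they only differ in their 17th character.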
Tools with columnar output (e.g., ls -l, ps aux) rely on fixed-width or dynamically calculated column sizes. Very long usernames can disrupt column alignment, causing misaligned data and/or fields overlapping into adjacent columns, making output unreadable.
The who and w commands display logged-in users; long usernames can be truncated when the output is limited to the terminal width, and/or cause misalignment of subsequent columns.
Tools like mail and write may misbehave with usernames longer than 8 or 16 characters if they rely on the historical utmp and wtmp files, which often store usernames in fixed-width fields.
Some other tools might replace a symbolic name with the corresponding numeric uid/gid to save their formatting.
While modern systems often raise the limit significantly (like 255 characters), practical problems with traditional utilities typically arise at these thresholds:
- 8 Characters: The most common historical truncation limit, especially in utilities based on early UNIX assumptions.
- 16 Characters: A more common practical limit in modern implementations, though some utilities may struggle with anything beyond this.
- 32 Characters or More: Rarely supported by older or unmaintained utilities; may exceed terminal width expectations, causing readability issues.
Basic UNIX User names
These are practical and straightforward usernames, typically conforming to conventions that ensure clarity and simplicity.
Characteristics:
- Short and Descriptive: Often use a single word or abbreviation
- Predictable: Easy to guess who the user might be or their role, such as root or guest.
- Standards-Compliant: Typically consist of lowercase alphanumeric characters, avoiding special symbols.
- Purpose-Oriented.
Examples:
- root - The all-powerful superuser account.
- user1, user2 - Default accounts during setup.
- guest - For temporary or anonymous users.
- mysql, nginx - To run those daemons under.
- ftp - For filesystem ownership.
- john, user123, dev_mike
Pros:
- Easy to remember.
- Low risk of causing technical issues.
- Ideal for organizational environments.
Suggested Character Rules:
- Alphanumeric Characters:
- Lowercase letters (a-z) are universally accepted and preferred.
- Numbers (0-9) are also valid, though a username cannot start with a number.
- Special Characters:
- The hyphen (-), underscore (_), and period (.) are typically allowed.
- Hyphens are often avoided as a leading character due to potential command-line ambiguities.
- Case Sensitivity:
- Usernames are case-sensitive, meaning john and John would be distinct accounts. However, lowercase is standard to avoid confusion.
Unnecessarily long usernames should be avoided; 8–16 characters is typically sufficient.
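The suggested rules can be condensed into a small check. The following Python sketch is one possible reading, not the rule set of any particular tool: it assumes that the first character must be a lowercase letter (which also rules out a leading digit, hyphen or period) and that 16 characters is the upper bound. The names BASIC_NAME and is_basic_username are invented for this example.

```python
import re

# Assumption: a lowercase letter first, then up to 15 more characters
# drawn from lowercase letters, digits, underscore, period and hyphen.
BASIC_NAME = re.compile(r"[a-z][a-z0-9_.-]{0,15}")


def is_basic_username(name: str) -> bool:
    """Check a name against the conservative rules sketched above."""
    return BASIC_NAME.fullmatch(name) is not None


for candidate in ["jdoe", "dev_mike", "alice.smith", "1user", "-user", "John"]:
    print(candidate, is_basic_username(candidate))
```

The check is deliberately stricter than what most systems accept; a site that allows uppercase letters or longer names would have to relax the pattern.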
Sane UNIX User names
These usernames strike a balance between being practical and creative while maintaining readability and usability.
Characteristics:
- Human-Friendly: Often based on real names or logical abbreviations, such as alice.smith or jdoe.
- Structured: Might include patterns like first initial + last name (asmith), or role-based prefixes (dev01).
- Moderately Complex: May use underscores or hyphens (alice_smith, bob-jones), but avoid overly complicated characters.
Pros:
- Easier to manage in multi-user environments.
- More professional and scalable for teams.
UTF-8 Usernames
Unicode allows a broader range of characters:
- Umlauts and Diacritics: Characters like ä, é, ñ, and ø are supported in many configurations.
- Non-Latin Scripts: Usernames can include characters from scripts such as Cyrillic (Дмитрий), Greek (Ανδρέας), or even logograms (e.g., 王小明 in Chinese).
- Emoji or Symbols: While technically valid (😎, ✓user), these are highly discouraged due to practical and compatibility concerns.
Examples: müller, søren, 王明, андрей
The PRECIS Framework defines Proposed Internet Standards to handle Internationalized Strings in Application Protocols and Usernames and Passwords (https://www.rfc-editor.org/rfc/rfc8265.html).
Creating user names containing Unicode characters like ä or ꭥ does work in Debian 12 and in most other Linux distributions. This has been tested in December 2024 for Debian, Ubuntu, Fedora, CentOS Stream and Alma Linux. Most of the distributions expect a dedicated command line option in the call of the respective adduser tool to allow such user names. This is probably not wanted.
One of the problems with UTF-8 user names is that it is possible to write strings that look identical when displayed but differ in comparison:
- Unicode characters can look visually identical to other characters from different scripts (e.g., the Latin letter "a" and the Cyrillic "а").
- The same applies to diacritical characters and accents such as the Scandinavian "å". There are both dedicated codepoints for the accented character and codepoints that allow the ring above to be combined with a regular lower case "a".
These ambiguities can lead to phishing or identity impersonation, as malicious actors can create usernames that resemble those of trusted users. The minimum remedy is to apply Unicode normalization, at least for comparison, duplicate detection and sorting.
At least Debian does not do Unicode Normalization for user names, and the author of these lines suspects that none of the other Linux distributions accepting UTF-8 user names do.
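What normalization does and does not achieve for comparison can be sketched with Python's unicodedata module. The helper same_after_nfc and the sample names are made up for this illustration; the choice of NFC is an assumption, and other normalization forms may suit other uses.

```python
import unicodedata


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two names after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "\u00e5ke"   # "å" as a single code point
decomposed = "a\u030ake"   # "a" followed by COMBINING RING ABOVE

print(precomposed == decomposed)                # raw comparison: False
print(same_after_nfc(precomposed, decomposed))  # normalized comparison: True

latin = "anna"
mixed = "\u0430nna"        # starts with CYRILLIC SMALL LETTER A
print(same_after_nfc(latin, mixed))             # still False
```

Normalization merges the two spellings of "å", but it leaves look-alike letters from different scripts distinct, so it addresses only part of the impersonation problem.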
Other issues that Unicode user names might pose are combining characters, characters that need to be presented with different widths, and direction-changing characters that allow for right-to-left or top-to-bottom languages.
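A rough way to spot such characters is to inspect each code point with Python's unicodedata module. This is a sketch under assumptions: the function name presentation_hazards and its labels are invented here, only explicit direction controls are flagged (not ordinary right-to-left letters or the invisible direction marks), and "wide" follows the East Asian Width property.

```python
import unicodedata

# Bidirectional classes of the explicit direction controls
# (embeddings, overrides, isolates and their terminators).
BIDI_CONTROLS = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def presentation_hazards(name: str) -> set[str]:
    """Return labels for presentation-related properties found in `name`."""
    found = set()
    for ch in name:
        if unicodedata.combining(ch):
            found.add("combining")
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            found.add("wide")
        if unicodedata.bidirectional(ch) in BIDI_CONTROLS:
            found.add("bidi-control")
    return found


for candidate in ["alice", "a\u030ake", "\u738b\u5c0f\u660e", "bob\u202e"]:
    print(repr(candidate), sorted(presentation_hazards(candidate)))
```

Such a check could be used to reject or flag names at creation time; it does not replace normalization.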
Insane UNIX Usernames
These are the unconventional, humorous, or downright problematic choices that can wreak havoc on systems or raise eyebrows.
Characteristics:
- Overly Complex: Contain excessive special characters, whitespace, or escape sequences (e.g., !$weird_user!, john*doe).
- Inappropriately Long: Exceed reasonable lengths, making them impractical to type or manage (e.g., this_is_an_exceedingly_long_username_that_no_one_can_remember_or_type).
- Confusing or Ambiguous: Names like null, 0, or yes that might conflict with commands or reserved keywords.
- Playful or Irreverent: Contain jokes, puns, or obscenities, e.g., sudo_me, h4x0r, slacker.
- Prone to Errors: Names with spaces or case sensitivity issues (e.g., JohnDoe vs. johndoe).
Examples:
- rm -rf - Potentially catastrophic if misused or not escaped properly.
- null - Confusing, as it resembles the null keyword.
- $(malicious_command) - Could be exploited in scripts.
- i_am_god - Arrogant and unprofessional.
Pros:
- Often serve as amusing anecdotes or inside jokes.
Cons:
- May break scripts or tools.
- Difficult to type or work with.
- Potentially dangerous or disruptive.
Reserved Characters:
- user$name or user@name: May conflict with shell syntax.
- user*name or user?name: These characters are interpreted as wildcards or regex patterns.
- Whitespace: john doe – Spaces are not valid in most systems.
- Leading or Trailing Special Characters:
- -user or user- – Ambiguous when used in commands.
- .user or user. – Could cause issues with hidden files.
- Ambiguity with Commands or Reserved Names:
- null, yes, root – May conflict with system functions.
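Scripts that have to cope with such names should never paste them into a shell command line unquoted. The following Python sketch shows one defensive pattern; the helper id_command is hypothetical, and it assumes that the called tool (id is used here as an example) honours the usual "--" end-of-options convention.

```python
import shlex


def id_command(name: str) -> list[str]:
    """Build an argument vector that looks up `name` with id(1)."""
    # "--" keeps a leading hyphen in the name from being read as an option.
    return ["id", "--", name]


for hostile in ["rm -rf", "$(malicious_command)", "john*doe", "-user"]:
    # shlex.join shows how the vector would have to be quoted for a shell.
    print(shlex.join(id_command(hostile)))
```

Passing the list directly to subprocess.run without shell=True avoids the shell altogether, so spaces, wildcards and command substitutions in the name are never interpreted.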
External Documentation
Other Operating Systems
Most of the information in this chapter was collected by Giuseppe Sacco on Debian-Devel - thanks!
Windows accepts any characters, except " / \ [ ] : ; | = , + * ? < > , and allows for 64 characters (or bytes, unsure).
SunOS has these restrictions: "a string of no more than thirty-two bytes consisting of characters from the set of alphabetic characters, numeric characters, period (.), underscore (_), and hyphen (-). The first character should be alphabetic and the field should contain at least one lowercase alphabetic character"
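As an illustration, the quoted SunOS wording can be turned into a small Python check. This is a sketch and not taken from SunOS itself: it assumes that "alphabetic" and "numeric" mean ASCII letters and digits, it treats the "should" clauses as hard requirements, and the name sunos_style_ok is invented for this example.

```python
import re

# Assumption: ASCII letters and digits only; first character alphabetic.
ALLOWED = re.compile(r"[A-Za-z][A-Za-z0-9._-]*")


def sunos_style_ok(name: str) -> bool:
    """Check a name against the rule quoted above, read strictly."""
    if len(name.encode("utf-8")) > 32:   # the rule counts bytes
        return False
    if ALLOWED.fullmatch(name) is None:
        return False
    # At least one lowercase alphabetic character is required.
    return any("a" <= ch <= "z" for ch in name)


for candidate in ["jdoe", "JDoe", "JDOE", "1abc", "alice.smith"]:
    print(candidate, sunos_style_ok(candidate))
```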
In LDAP the uid field is a "Directory String", so any non-zero-length UTF-8 text. There is a note: Servers and clients MUST be prepared to receive arbitrary UCS code points, including code points outside the range of printable ASCII and code points not presently assigned to any character.
FreeBSD suggests to "use user names that consist of eight or fewer, all lower case characters in order to maintain backwards compatibility with applications." But the real syntax is: login name must not begin with a hyphen, and cannot contain 8-bit characters, tabs or spaces, or any of these symbols: `,:+&#%^()!@~*?<>=|\/";'. The dollar symbol is allowed only as the last character for use with Samba. No field may contain a colon as this has been used historically to separate the fields in the user database.
IBM AIX has these rules: must not begin with a hyphen (-), plus sign (+), at sign (@), or tilde (~). Additionally, do not use any of the following characters within a user-name string: :"#,=\/?'`. Finally, the login parameter cannot contain any space, tab, or newline characters.
On HP-UX, user names are restricted to eight characters and group names to 16 characters, but the limits can be raised to up to 254 characters. In any case, a name must start with a letter.
The Kerberos syntax for a principal is GeneralString, constrained to contain only characters in IA5String (so, basically 7-bit US-ASCII), with this note: US-ASCII control characters should not be used.