
What is the standard format for a browser's User-Agent string?

Is there an RFC, official standard, or template for creating a User-Agent string? The iPhone's user-agent string seems strange...

Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_1_2 like Mac OS X; en-us) AppleWebKit/528.18 (KHTML, like Gecko) Version/4.0 Mobile/7D11 Safari/528.16

The iPhone seriously puts Mozilla/5.0 at the beginning of its user agent?
@Slokun why the surprise? The IE user-agent starts with Mozilla/4.0. Remember that Mozilla was one of the first browsers to be made, and all others include, to various degrees, parts of its foundation.
The explanation on useragentstring.com is that it should just mean Gecko-based browsers (Netscape and Firefox) but most other browsers include it to say they're Mozilla-compatible.
Think of Mozilla/ as "not Lynx". Generally text-only = not Mozilla-compatible. Some old WML/HDML feature-phone browsers also don't identify as Mozilla. (Fun fact: all the browsers before Lynx died of dysentery or were eaten by grues.)

Dai

The User-Agent header is defined in RFC 7231 (section 5.5.3), which superseded RFC 2616 and, before that, RFC 1945. The earlier wording, from RFC 2616, states:

The User-Agent request-header field contains information about the user agent originating the request. This is for statistical purposes, the tracing of protocol violations, and automated recognition of user agents for the sake of tailoring responses to avoid particular user agent limitations. User agents SHOULD include this field with requests. The field can contain multiple product tokens (section 3.8) and comments identifying the agent and any subproducts which form a significant part of the user agent. By convention, the product tokens are listed in order of their significance for identifying the application.

EBNF Definitions:

   User-Agent      = "User-Agent" ":" 1*( product | comment )

Where product is defined as:

   product         = token ["/" product-version]
   product-version = token
   token           = 1*<any CHAR except CTLs or separators>

And comment as:

   comment         = "(" *( ctext | quoted-pair | comment ) ")"
   ctext           = <any TEXT excluding "(" and ")">

And other rules, for reference:

   CTL             = <control characters, i.e. ASCII 0x00 through 0x1F and 0x7F>
   separators      = "(" | ")" | "<" | ">" | "@"
                   | "," | ";" | ":" | "\" | <">
                   | "/" | "[" | "]" | "?" | "="
                   | "{" | "}" | SP | HT
   SP              = <ASCII space 0x20, i.e. " ">
   HT              = <ASCII horizontal tab 0x09, aka '\t'>

Note that this means that product strings cannot contain spaces, but comment strings can.
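
For readers who find the grammar easier to follow as code, here is a minimal Python sketch of a splitter that follows the rules above. The names parse_user_agent and TOKEN_RE are made up for this illustration; the character class is the set of ASCII characters left over once CTLs and separators are excluded. It is a sketch under those assumptions, not a complete HTTP header parser.

```python
import re

# Characters allowed in a `token`: printable ASCII minus the separators.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def parse_user_agent(value):
    """Split a User-Agent value into ("product", name, version) and ("comment", text) tuples."""
    parts = []
    i, n = 0, len(value)
    while i < n:
        ch = value[i]
        if ch in " \t":
            i += 1  # whitespace only separates parts
        elif ch == "(":
            # Scan to the matching ")" while tracking nesting depth.
            depth, j = 0, i
            while j < n:
                if value[j] == "\\" and j + 1 < n:
                    j += 2  # quoted-pair: skip the escaped character
                    continue
                if value[j] == "(":
                    depth += 1
                elif value[j] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            else:
                raise ValueError("unbalanced parentheses in comment")
            parts.append(("comment", value[i + 1:j]))
            i = j + 1
        else:
            m = TOKEN_RE.match(value, i)
            if not m:
                raise ValueError(f"unexpected character {ch!r} at offset {i}")
            name, version = m.group(), None
            i = m.end()
            if i < n and value[i] == "/":
                m = TOKEN_RE.match(value, i + 1)
                if not m:
                    raise ValueError("missing product-version after '/'")
                version = m.group()
                i = m.end()
            parts.append(("product", name, version))
    if not parts:
        raise ValueError("a User-Agent value needs at least one product or comment")
    return parts


iphone = ("Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_1_2 like Mac OS X; en-us) "
          "AppleWebKit/528.18 (KHTML, like Gecko) Version/4.0 Mobile/7D11 Safari/528.16")
for part in parse_user_agent(iphone):
    print(part)
```

Run on the iPhone string from the question, this yields five products and two comments, which shows that the "strange" string is just an ordinary sequence of product and comment parts.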

Examples:

Here are some valid examples of product strings (with and without product-version strings):

# Single `product` without product-version:
Foobar
Foobar-baz

# Single `product` with product-version:
Foobar/abc
Foobar/1.0.0
Foobar/2021.44.30.15-b917dc

Here are some valid examples of comment strings; note how all strings are enclosed in matched parentheses ( ):

# This was the default `comment` used by Internet Explorer 11:
(Windows NT 6.1; WOW64; Trident/7.0; rv:11.0)

# You can put almost any text inside a comment:
(Why are you looking at HTTP headers? Go outside, find love, do some good in the world)

# Note that `comment` strings can also be nested, provided their delimiting parentheses are matched, for example:
(Outer comment (Inner comment))

Since a User-Agent header's value consists of arbitrary product and comment strings, these are all valid User-Agent headers:

User-Agent: Foobar
User-Agent: Foobar/2021.44.30.15-b917dc
User-Agent: MyProduct Foobar/2021.44.30.15-b917dc
User-Agent: Tsom/OfraHaza (Life is short and love is always over in the morning) AnotherProduct
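
If you are producing a header rather than reading one, the same rules can be applied in the other direction. The following Python sketch assumes hypothetical helpers named product, comment and build_user_agent (none of them come from a library). It rejects product names or versions that are not valid tokens, and it backslash-escapes parentheses in comment text (the quoted-pair rule) instead of requiring them to be balanced, which is one possible design choice among several.

```python
import re

# Characters allowed in a `token`: printable ASCII minus the separators.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def product(name, version=None):
    """Return "name" or "name/version", rejecting anything that is not a token."""
    for piece in (name, version):
        if piece is not None and not TOKEN_RE.fullmatch(piece):
            raise ValueError(f"not a valid token: {piece!r}")
    return name if version is None else f"{name}/{version}"


def comment(text):
    """Wrap text in parentheses, escaping "\\", "(" and ")" as quoted-pairs."""
    escaped = "".join("\\" + c if c in "\\()" else c for c in text)
    return f"({escaped})"


def build_user_agent(*parts):
    """Join already formatted product and comment strings with single spaces."""
    if not parts:
        raise ValueError("at least one product or comment is required")
    return " ".join(parts)


value = build_user_agent(
    product("Foobar", "2021.44.30.15-b917dc"),
    comment("Windows NT 6.1; WOW64"),
)
print("User-Agent:", value)
```

Calling product("Foo bar") raises ValueError, which matches the earlier note that product strings cannot contain spaces while comments can.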

Thanks, this is exactly what I was looking for. There doesn't appear to be a standard format for the comment field.
Some examples of this, for readers unfamiliar with EBNF, would be ideal. (=
The referenced RFC is now obsolete. RFC 7231 (tools.ietf.org/html/rfc7231) obsoletes it.
Funnily enough, RFC 7231 specifically calls out "us[ing] the product tokens of other implementations in order to declare compatibility with them" as a Bad Idea.
Can the User-Agent string be in any character set? Can it contain for example Russian or Chinese characters?
Community

This is specified in RFC 1945 in the section on Request Headers. It is not a very standardized format, though, and user agents tend to put whatever they want in there.


Thx! Your answer combined with Paulo's makes a complete answer.
You're welcome! It looks like Paulo's is actually more complete and up-to-date, so feel free to mark his as accepted.
wlk

Yes, see the Mozilla website. But as mentioned before, you can basically put whatever you want there. For statistical and analytical purposes, the most important thing is that each browser/OS keeps its own format consistent.

