web: add support for URI-reference

Based on a patch by Daniel Hartwig <mandyke@gmail.com>.

* NEWS: Update.
* doc/ref/web.texi (URIs): Fragments are properly part of a URI, so
  remove the incorrect note.  Add documentation on URI subtypes.
* module/web/uri.scm (uri-reference?): New base type predicate.
  (uri?, relative-ref?): Specific predicates.
  (validate-uri-reference): Strict validation.
  (validate-uri, validate-relative-ref): Specific validators.
  (build-uri-reference, build-relative-ref): New constructors.
  (string->uri-reference): Rename from string->uri.
  (string->uri, string->relative-ref): Specific constructors.
  (uri->string): Add #:include-fragment? keyword argument.
* module/web/http.scm (parse-request-uri): Use `build-uri-reference',
  and result is a URI-reference, not URI, object.  No longer infer an
  absent `uri-scheme' is `http'.
  (write-uri): Just use `uri->string'.
  (declare-uri-header!): Remove unused function.
  (declare-uri-reference-header!): Update.  Rename from
  `declare-relative-uri-header!'.
* test-suite/tests/web-uri.test ("build-uri-reference"):
  ("string->uri-reference"): Add.
  ("uri->string"): Also tests for relative-refs.
* test-suite/tests/web-http.test ("read-request-line"):
  ("write-request-line"): Update for no scheme in some URIs.
  ("entity headers", "request headers"): Content-location, Referer, and
  Location should also parse relative-URIs.
* test-suite/tests/web-request.test ("example-1"): Expect URI-reference
  with no scheme.
This commit is contained in:
Andy Wingo 2017-05-21 11:56:59 +02:00
commit 7095a536f3
9 changed files with 340 additions and 148 deletions

18
NEWS
View file

@ -8,6 +8,24 @@ Please send Guile bug reports to bug-guile@gnu.org.
Changes in 2.2.3 (since 2.2.2): Changes in 2.2.3 (since 2.2.2):
* New interfaces
** (web uri) module has better support for RFC 3986
The URI standard, RFC 3986, defines additional "relative-ref" and
"URI-reference" data types. Thanks to Daniel Hartwig, Guile's support
for these URI subtypes has been improved. See "Universal Resource
Identifiers" in the manual, for more.
* New deprecations
** Using `uri?' as a predicate on relative-refs deprecated
If you don't care whether the URI is a relative-ref or not, use
`uri-reference?'. If you do, use `uri-reference?' and `relative-ref?'.
In the future `uri?' will return a true value only for URIs that specify
a scheme.
* Bug fixes * Bug fixes
** Enable GNU Readline 7.0's support for "bracketed paste". ** Enable GNU Readline 7.0's support for "bracketed paste".

View file

@ -173,23 +173,13 @@ Guile provides a standard data type for Universal Resource Identifiers
The generic URI syntax is as follows: The generic URI syntax is as follows:
@example @example
URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \ URI-reference := [scheme ":"] ["//" [userinfo "@@"] host [":" port]] path \
[ "?" query ] [ "#" fragment ] [ "?" query ] [ "#" fragment ]
@end example @end example
For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
scheme is @code{http}, the host is @code{www.gnu.org}, the path is scheme is @code{http}, the host is @code{www.gnu.org}, the path is
@code{/help/}, and there is no userinfo, port, query, or fragment. All @code{/help/}, and there is no userinfo, port, query, or fragment.
URIs have a scheme and a path (though the path might be empty). Some
URIs have a host, and some of those have ports and userinfo. Any URI
might have a query part or a fragment.
There is also a ``URI-reference'' data type, which is the same as a URI
but where the scheme is optional. In this case, the scheme is taken to
be relative to some other related URI. A common use of URI references
is when you want to be vague regarding the choice of HTTP or HTTPS --
serving a web page referring to @code{/foo.css} will use HTTPS if loaded
over HTTPS, or HTTP otherwise.
Userinfo is something of an abstraction, as some legacy URI schemes Userinfo is something of an abstraction, as some legacy URI schemes
allowed userinfo of the form @code{@var{username}:@var{passwd}}. But allowed userinfo of the form @code{@var{username}:@var{passwd}}. But
@ -197,14 +187,6 @@ since passwords do not belong in URIs, the RFC does not want to condone
this practice, so it calls anything before the @code{@@} sign this practice, so it calls anything before the @code{@@} sign
@dfn{userinfo}. @dfn{userinfo}.
Properly speaking, a fragment is not part of a URI. For example, when a
web browser follows a link to @indicateurl{http://example.com/#foo}, it
sends a request for @indicateurl{http://example.com/}, then looks in the
resulting page for the fragment identified @code{foo} reference. A
fragment identifies a part of a resource, not the resource itself. But
it is useful to have a fragment field in the URI record itself, so we
hope you will forgive the inconsistency.
@example @example
(use-modules (web uri)) (use-modules (web uri))
@end example @end example
@ -213,40 +195,36 @@ The following procedures can be found in the @code{(web uri)}
module. Load it into your Guile, using a form like the above, to have module. Load it into your Guile, using a form like the above, to have
access to them. access to them.
The most common way to build a URI from Scheme is with the
@code{build-uri} function.
@deffn {Scheme Procedure} build-uri scheme @ @deffn {Scheme Procedure} build-uri scheme @
[#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @ [#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @
[#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @ [#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @
[#:validate?=@code{#t}] [#:validate?=@code{#t}]
Construct a URI object. @var{scheme} should be a symbol, @var{port} Construct a URI. @var{scheme} should be a symbol, @var{port} either a
either a positive, exact integer or @code{#f}, and the rest of the positive, exact integer or @code{#f}, and the rest of the fields are
fields are either strings or @code{#f}. If @var{validate?} is true, either strings or @code{#f}. If @var{validate?} is true, also run some
also run some consistency checks to make sure that the constructed URI consistency checks to make sure that the constructed URI is valid.
is valid.
@end deffn @end deffn
@deffn {Scheme Procedure} build-uri-reference [#:scheme=@code{#f}]@
[#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @
[#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @
[#:validate?=@code{#t}]
Like @code{build-uri}, but with an optional scheme.
@end deffn
In Guile, both URI and URI reference data types are represented in the
same way, as URI objects.
@deffn {Scheme Procedure} uri? obj @deffn {Scheme Procedure} uri? obj
@deffnx {Scheme Procedure} uri-scheme uri Return @code{#t} if @var{obj} is a URI.
@end deffn
Guile, URIs are represented as URI records, with a number of associated
accessors.
@deffn {Scheme Procedure} uri-scheme uri
@deffnx {Scheme Procedure} uri-userinfo uri @deffnx {Scheme Procedure} uri-userinfo uri
@deffnx {Scheme Procedure} uri-host uri @deffnx {Scheme Procedure} uri-host uri
@deffnx {Scheme Procedure} uri-port uri @deffnx {Scheme Procedure} uri-port uri
@deffnx {Scheme Procedure} uri-path uri @deffnx {Scheme Procedure} uri-path uri
@deffnx {Scheme Procedure} uri-query uri @deffnx {Scheme Procedure} uri-query uri
@deffnx {Scheme Procedure} uri-fragment uri @deffnx {Scheme Procedure} uri-fragment uri
A predicate and field accessors for the URI record type. The URI scheme Field accessors for the URI record type. The URI scheme will be a
will be a symbol, or @code{#f} if the object is a URI reference but not symbol, or @code{#f} if the object is a relative-ref (see below). The
a URI. The port will be either a positive, exact integer or @code{#f}, port will be either a positive, exact integer or @code{#f}, and the rest
and the rest of the fields will be either strings or @code{#f} if not of the fields will be either strings or @code{#f} if not present.
present.
@end deffn @end deffn
@deffn {Scheme Procedure} string->uri string @deffn {Scheme Procedure} string->uri string
@ -254,15 +232,11 @@ Parse @var{string} into a URI object. Return @code{#f} if the string
could not be parsed. could not be parsed.
@end deffn @end deffn
@deffn {Scheme Procedure} string->uri-reference string @deffn {Scheme Procedure} uri->string uri [#:include-fragment?=@code{#t}]
Parse @var{string} into a URI object, while not requiring a scheme.
Return @code{#f} if the string could not be parsed.
@end deffn
@deffn {Scheme Procedure} uri->string uri
Serialize @var{uri} to a string. If the URI has a port that is the Serialize @var{uri} to a string. If the URI has a port that is the
default port for its scheme, the port is not included in the default port for its scheme, the port is not included in the
serialization. serialization. If @var{include-fragment?} is given as false, the
resulting string will omit the fragment (if any).
@end deffn @end deffn
@deffn {Scheme Procedure} declare-default-port! scheme port @deffn {Scheme Procedure} declare-default-port! scheme port
@ -323,6 +297,70 @@ For example, the list @code{("scrambled eggs" "biscuits&gravy")} encodes
as @code{"scrambled%20eggs/biscuits%26gravy"}. as @code{"scrambled%20eggs/biscuits%26gravy"}.
@end deffn @end deffn
@subsubheading Subtypes of URI
As we noted above, not all URI objects have a scheme. You might have
noted in the ``generic URI syntax'' example that the left-hand side of
that grammar definition was URI-reference, not URI. A
@dfn{URI-reference} is a generalization of a URI where the scheme is
optional. If no scheme is specified, it is taken to be relative to some
other related URI. A common use of URI references is when you want to
be vague regarding the choice of HTTP or HTTPS -- serving a web page
referring to @code{/foo.css} will use HTTPS if loaded over HTTPS, or
HTTP otherwise.
@deffn {Scheme Procedure} build-uri-reference [#:scheme=@code{#f}]@
[#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @
[#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @
[#:validate?=@code{#t}]
Like @code{build-uri}, but with an optional scheme.
@end deffn
@deffn {Scheme Procedure} uri-reference? obj
Return @code{#t} if @var{obj} is a URI-reference. This is the most
general URI predicate, as it includes not only full URIs that have
schemes (those that match @code{uri?}) but also URIs without schemes.
@end deffn
It's also possible to build a @dfn{relative-ref}: a URI-reference that
explicitly lacks a scheme.
@deffn {Scheme Procedure} build-relative-ref @
[#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @
[#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @
[#:validate?=@code{#t}]
Like @code{build-uri}, but with no scheme.
@end deffn
@deffn {Scheme Procedure} relative-ref? obj
Return @code{#t} if @var{obj} is a ``relative-ref'': a URI-reference
that has no scheme. Every URI-reference will either match @code{uri?}
or @code{relative-ref?} (but not both).
@end deffn
In case it's not clear from the above, the most general of these URI
types is the URI-reference, with @code{build-uri-reference} as the most
general constructor. @code{build-uri} and @code{build-relative-ref}
enforce enforce specific restrictions on the URI-reference. The most
generic URI parser is then @code{string->uri-reference}, and there is
also a parser for when you know that you want a relative-ref.
@deffn {Scheme Procedure} string->uri-reference string
Parse @var{string} into a URI object, while not requiring a scheme.
Return @code{#f} if the string could not be parsed.
@end deffn
@deffn {Scheme Procedure} string->relative-ref string
Parse @var{string} into a URI object, while asserting that no scheme is
present. Return @code{#f} if the string could not be parsed.
@end deffn
For compatibility reasons, note that @code{uri?} will return @code{#t}
for all URI objects, even relative-refs. In contrast, @code{build-uri}
and @code{string->uri} require that the resulting URI not be a
relative-ref. As a predicate to distinguish relative-refs from proper
URIs (in the language of RFC 3986), use something like @code{(and
(uri-reference? @var{x}) (not (relative-ref? @var{x})))}.
@node HTTP @node HTTP
@subsection The Hyper-Text Transfer Protocol @subsection The Hyper-Text Transfer Protocol

View file

@ -164,16 +164,16 @@ host name without trailing dot."
get-position set-position! get-position set-position!
close)))) close))))
(define (ensure-uri uri-or-string) (define (ensure-uri-reference uri-or-string)
(cond (cond
((string? uri-or-string) (string->uri uri-or-string)) ((string? uri-or-string) (string->uri-reference uri-or-string))
((uri? uri-or-string) uri-or-string) ((uri-reference? uri-or-string) uri-or-string)
(else (error "Invalid URI" uri-or-string)))) (else (error "Invalid URI-reference" uri-or-string))))
(define (open-socket-for-uri uri-or-string) (define (open-socket-for-uri uri-or-string)
"Return an open input/output port for a connection to URI." "Return an open input/output port for a connection to URI."
(define http-proxy (current-http-proxy)) (define http-proxy (current-http-proxy))
(define uri (ensure-uri (or http-proxy uri-or-string))) (define uri (ensure-uri-reference (or http-proxy uri-or-string)))
(define addresses (define addresses
(let ((port (uri-port uri))) (let ((port (uri-port uri)))
(delete-duplicates (delete-duplicates
@ -344,7 +344,7 @@ as is the case by default with a request returned by `build-request'."
(streaming? #f) (streaming? #f)
(request (request
(build-request (build-request
(ensure-uri uri) (ensure-uri-reference uri)
#:method method #:method method
#:version version #:version version
#:headers (if keep-alive? #:headers (if keep-alive?

View file

@ -1112,7 +1112,8 @@ symbol, like GET."
(define* (parse-request-uri str #:optional (start 0) (end (string-length str))) (define* (parse-request-uri str #:optional (start 0) (end (string-length str)))
"Parse a URI from an HTTP request line. Note that URIs in requests do "Parse a URI from an HTTP request line. Note that URIs in requests do
not have to have a scheme or host name. The result is a URI object." not have to have a scheme or host name. The result is a URI-reference
object."
(cond (cond
((= start end) ((= start end)
(bad-request "Missing Request-URI")) (bad-request "Missing Request-URI"))
@ -1122,10 +1123,10 @@ not have to have a scheme or host name. The result is a URI object."
(let* ((q (string-index str #\? start end)) (let* ((q (string-index str #\? start end))
(f (string-index str #\# start end)) (f (string-index str #\# start end))
(q (and q (or (not f) (< q f)) q))) (q (and q (or (not f) (< q f)) q)))
(build-uri 'http (build-uri-reference
#:path (substring str start (or q f end)) #:path (substring str start (or q f end))
#:query (and q (substring str (1+ q) (or f end))) #:query (and q (substring str (1+ q) (or f end)))
#:fragment (and f (substring str (1+ f) end))))) #:fragment (and f (substring str (1+ f) end)))))
(else (else
(or (string->uri (substring str start end)) (or (string->uri (substring str start end))
(bad-request "Invalid URI: ~a" (substring str start end)))))) (bad-request "Invalid URI: ~a" (substring str start end))))))
@ -1143,31 +1144,7 @@ three values: the method, the URI, and the version."
(parse-http-version line (1+ d1) (string-length line))))) (parse-http-version line (1+ d1) (string-length line)))))
(define (write-uri uri port) (define (write-uri uri port)
(when (uri-host uri) (put-string port (uri->string uri #:include-fragment? #f)))
(when (uri-scheme uri)
(put-symbol port (uri-scheme uri))
(put-char port #\:))
(put-string port "//")
(when (uri-userinfo uri)
(put-string port (uri-userinfo uri))
(put-char port #\@))
(put-string port (uri-host uri))
(let ((p (uri-port uri)))
(when (and p (not (eqv? p 80)))
(put-char port #\:)
(put-non-negative-integer port p))))
(let* ((path (uri-path uri))
(len (string-length path)))
(cond
((and (> len 0) (not (eqv? (string-ref path 0) #\/)))
(bad-request "Non-absolute URI path: ~s" path))
((and (zero? len) (not (uri-host uri)))
(bad-request "Empty path and no host for URI: ~s" uri))
(else
(put-string port path))))
(when (uri-query uri)
(put-char port #\?)
(put-string port (uri-query uri))))
(define (write-request-line method uri version port) (define (write-request-line method uri version port)
"Write the first line of an HTTP request to PORT." "Write the first line of an HTTP request to PORT."
@ -1272,20 +1249,13 @@ treated specially, and is just returned as a plain string."
parse-non-negative-integer non-negative-integer? parse-non-negative-integer non-negative-integer?
(lambda (val port) (put-non-negative-integer port val)))) (lambda (val port) (put-non-negative-integer port val))))
;; emacs: (put 'declare-uri-header! 'scheme-indent-function 1)
(define (declare-uri-header! name)
(declare-header! name
(lambda (str) (or (string->uri str) (bad-header-component 'uri str)))
(@@ (web uri) absolute-uri?)
write-uri))
;; emacs: (put 'declare-uri-reference-header! 'scheme-indent-function 1) ;; emacs: (put 'declare-uri-reference-header! 'scheme-indent-function 1)
(define (declare-uri-reference-header! name) (define (declare-uri-reference-header! name)
(declare-header! name (declare-header! name
(lambda (str) (lambda (str)
(or (string->uri-reference str) (or (string->uri-reference str)
(bad-header-component 'uri str))) (bad-header-component 'uri-reference str)))
uri? uri-reference?
write-uri)) write-uri))
;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1) ;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1)

View file

@ -170,7 +170,7 @@ the headers are each run through their respective validators."
(non-negative-integer? (car version)) (non-negative-integer? (car version))
(non-negative-integer? (cdr version)))) (non-negative-integer? (cdr version))))
(bad-request "Bad version: ~a" version)) (bad-request "Bad version: ~a" version))
((not (uri? uri)) ((not (uri-reference? uri))
(bad-request "Bad uri: ~a" uri)) (bad-request "Bad uri: ~a" uri))
((and (not port) (memq method '(POST PUT))) ((and (not port) (memq method '(POST PUT)))
(bad-request "Missing port for message ~a" method)) (bad-request "Missing port for message ~a" method))

View file

@ -42,11 +42,15 @@
uri->string uri->string
uri-decode uri-encode uri-decode uri-encode
split-and-decode-uri-path split-and-decode-uri-path
encode-and-join-uri-path)) encode-and-join-uri-path
uri-reference? relative-ref?
build-uri-reference build-relative-ref
string->uri-reference string->relative-ref))
(define-record-type <uri> (define-record-type <uri>
(make-uri scheme userinfo host port path query fragment) (make-uri scheme userinfo host port path query fragment)
uri? uri-reference?
(scheme uri-scheme) (scheme uri-scheme)
(userinfo uri-userinfo) (userinfo uri-userinfo)
(host uri-host) (host uri-host)
@ -55,8 +59,49 @@
(query uri-query) (query uri-query)
(fragment uri-fragment)) (fragment uri-fragment))
(define (absolute-uri? obj) ;;;
(and (uri? obj) (uri-scheme obj) #t)) ;;; Predicates.
;;;
;;; These are quick, and assume rigid validation at construction time.
;;; RFC 3986, #3.
;;;
;;; URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
;;;
;;; hier-part = "//" authority path-abempty
;;; / path-absolute
;;; / path-rootless
;;; / path-empty
(define (uri? obj)
(and (uri-reference? obj)
(if (include-deprecated-features)
(begin
(unless (uri-scheme obj)
(issue-deprecation-warning
"Use uri-reference? instead of uri?; in the future, uri?
will require that the object not be a relative-ref."))
#t)
(uri-scheme obj))
#t))
;;; RFC 3986, #4.2.
;;;
;;; relative-ref = relative-part [ "?" query ] [ "#" fragment ]
;;;
;;; relative-part = "//" authority path-abempty
;;; / path-absolute
;;; / path-noscheme
;;; / path-empty
(define (relative-ref? obj)
(and (uri-reference? obj)
(not (uri-scheme obj))))
;;;
;;; Constructors.
;;;
(define (uri-error message . args) (define (uri-error message . args)
(throw 'uri-error message args)) (throw 'uri-error message args))
@ -64,10 +109,9 @@
(define (positive-exact-integer? port) (define (positive-exact-integer? port)
(and (number? port) (exact? port) (integer? port) (positive? port))) (and (number? port) (exact? port) (integer? port) (positive? port)))
(define* (validate-uri scheme userinfo host port path query fragment (define (validate-uri-reference scheme userinfo host port path query fragment)
#:key reference?)
(cond (cond
((and (not reference?) (not (symbol? scheme))) ((and scheme (not (symbol? scheme)))
(uri-error "Expected a symbol for the URI scheme: ~s" scheme)) (uri-error "Expected a symbol for the URI scheme: ~s" scheme))
((and (or userinfo port) (not host)) ((and (or userinfo port) (not host))
(uri-error "Expected a host, given userinfo or port")) (uri-error "Expected a host, given userinfo or port"))
@ -79,32 +123,65 @@
(uri-error "Expected string for userinfo: ~s" userinfo)) (uri-error "Expected string for userinfo: ~s" userinfo))
((not (string? path)) ((not (string? path))
(uri-error "Expected string for path: ~s" path)) (uri-error "Expected string for path: ~s" path))
((and host (not (string-null? path)) ((and query (not (string? query)))
(not (eqv? (string-ref path 0) #\/))) (uri-error "Expected string for query: ~s" query))
(uri-error "Expected path of absolute URI to start with a /: ~a" path)))) ((and fragment (not (string? fragment)))
(uri-error "Expected string for fragment: ~s" fragment))
;; Strict validation of allowed paths, based on other components.
;; Refer to RFC 3986 for the details.
((not (string-null? path))
(if host
(cond
((not (eqv? (string-ref path 0) #\/))
(uri-error
"Expected absolute path starting with \"/\": ~a" path)))
(cond
((string-prefix? "//" path)
(uri-error
"Expected path not starting with \"//\" (no host): ~a" path))
((and (not scheme)
(not (eqv? (string-ref path 0) #\/))
(let ((colon (string-index path #\:)))
(and colon (not (string-index path #\/ 0 colon)))))
(uri-error
"Expected relative path's first segment without \":\": ~a"
path)))))))
(define* (build-uri scheme #:key userinfo host port (path "") query fragment (define* (build-uri scheme #:key userinfo host port (path "") query fragment
(validate? #t)) (validate? #t))
"Construct a URI object. SCHEME should be a symbol, PORT either a "Construct a URI object. SCHEME should be a symbol, PORT either a
positive, exact integer or #f, and the rest of the fields are either positive, exact integer or #f, and the rest of the fields are either
strings or #f. If VALIDATE? is true, also run some consistency checks strings or #f. If VALIDATE? is true, also run some consistency checks
to make sure that the constructed object is a valid absolute URI." to make sure that the constructed object is a valid URI."
(if validate? (when validate?
(validate-uri scheme userinfo host port path query fragment)) (unless scheme (uri-error "Missing URI scheme"))
(validate-uri-reference scheme userinfo host port path query fragment))
(make-uri scheme userinfo host port path query fragment)) (make-uri scheme userinfo host port path query fragment))
(define* (build-uri-reference #:key scheme userinfo host port (path "") query (define* (build-uri-reference #:key scheme userinfo host port (path "") query
fragment (validate? #t)) fragment (validate? #t))
"Construct a URI object. SCHEME should be a symbol or #f, PORT "Construct a URI-reference object. SCHEME should be a symbol or #f,
either a positive, exact integer or #f, and the rest of the fields PORT either a positive, exact integer or #f, and the rest of the
are either strings or #f. If VALIDATE? is true, also run some fields are either strings or #f. If VALIDATE? is true, also run some
consistency checks to make sure that the constructed URI is a valid URI consistency checks to make sure that the constructed URI is a valid URI
reference (either an absolute URI or a relative reference)." reference."
(if validate? (when validate?
(validate-uri scheme userinfo host port path query fragment (validate-uri-reference scheme userinfo host port path query fragment))
#:reference? #t))
(make-uri scheme userinfo host port path query fragment)) (make-uri scheme userinfo host port path query fragment))
(define* (build-relative-ref #:key userinfo host port (path "") query fragment
(validate? #t))
"Construct a relative-ref URI object. The arguments are the same as
for build-uri except there is no scheme."
(when validate?
(validate-uri-reference #f userinfo host port path query fragment))
(make-uri #f userinfo host port path query fragment))
;;;
;;; Converters.
;;;
;; See RFC 3986 #3.2.2 for comments on percent-encodings, IDNA (RFC ;; See RFC 3986 #3.2.2 for comments on percent-encodings, IDNA (RFC
;; 3490), and non-ASCII host names. ;; 3490), and non-ASCII host names.
;; ;;
@ -192,16 +269,24 @@ reference (either an absolute URI or a relative reference)."
(make-regexp uri-pat)) (make-regexp uri-pat))
(define (string->uri-reference string) (define (string->uri-reference string)
"Parse the URI reference written as STRING into a URI object. Return "Parse STRING into a URI-reference object. Return #f if the string
#f if the string could not be parsed." could not be parsed."
(% (let ((m (regexp-exec uri-regexp string))) (% (let ((m (regexp-exec uri-regexp string)))
(if (not m) (abort)) (unless m (abort))
(let ((scheme (let ((str (match:substring m 2))) (let ((scheme (let ((str (match:substring m 2)))
(and str (string->symbol (string-downcase str))))) (and str (string->symbol (string-downcase str)))))
(authority (match:substring m 3)) (authority (match:substring m 3))
(path (match:substring m 4)) (path (match:substring m 4))
(query (match:substring m 6)) (query (match:substring m 6))
(fragment (match:substring m 8))) (fragment (match:substring m 8)))
;; The regular expression already ensures all of the validation
;; requirements for URI-references, except the one that the
;; first component of a relative-ref's path can't contain a
;; colon.
(unless scheme
(let ((colon (string-index path #\:)))
(when (and colon (not (string-index path #\/ 0 colon)))
(abort))))
(call-with-values (call-with-values
(lambda () (lambda ()
(if authority (if authority
@ -213,10 +298,19 @@ reference (either an absolute URI or a relative reference)."
#f))) #f)))
(define (string->uri string) (define (string->uri string)
"Parse STRING into an absolute URI object. Return #f if the string "Parse STRING into a URI object. Return #f if the string could not
could not be parsed." be parsed. Note that this procedure will require that the URI have a
(let ((uri (string->uri-reference string))) scheme."
(and uri (uri-scheme uri) uri))) (let ((uri-reference (string->uri-reference string)))
(and (not (relative-ref? uri-reference))
uri-reference)))
(define (string->relative-ref string)
"Parse STRING into a relative-ref URI object. Return #f if the
string could not be parsed."
(let ((uri-reference (string->uri-reference string)))
(and (relative-ref? uri-reference)
uri-reference)))
(define *default-ports* (make-hash-table)) (define *default-ports* (make-hash-table))
@ -231,7 +325,7 @@ could not be parsed."
(declare-default-port! 'http 80) (declare-default-port! 'http 80)
(declare-default-port! 'https 443) (declare-default-port! 'https 443)
(define (uri->string uri) (define* (uri->string uri #:key (include-fragment? #t))
"Serialize URI to a string. If the URI has a port that is the "Serialize URI to a string. If the URI has a port that is the
default port for its scheme, the port is not included in the default port for its scheme, the port is not included in the
serialization." serialization."
@ -261,7 +355,7 @@ serialization."
(if query (if query
(string-append "?" query) (string-append "?" query)
"") "")
(if fragment (if (and fragment include-fragment?)
(string-append "#" fragment) (string-append "#" fragment)
"")))) ""))))

View file

@ -1,6 +1,6 @@
;;;; web-http.test --- HTTP library -*- mode: scheme; coding: utf-8; -*- ;;;; web-http.test --- HTTP library -*- mode: scheme; coding: utf-8; -*-
;;;; ;;;;
;;;; Copyright (C) 2010-2011, 2014-2016 Free Software Foundation, Inc. ;;;; Copyright (C) 2010-2011, 2014-2017 Free Software Foundation, Inc.
;;;; ;;;;
;;;; This library is free software; you can redistribute it and/or ;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public ;;;; modify it under the terms of the GNU Lesser General Public
@ -150,32 +150,33 @@
(with-test-prefix "read-request-line" (with-test-prefix "read-request-line"
(pass-if-read-request-line "GET / HTTP/1.1" (pass-if-read-request-line "GET / HTTP/1.1"
GET GET
(build-uri 'http (build-uri-reference
#:path "/") #:path "/")
(1 . 1)) (1 . 1))
(pass-if-read-request-line "GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1" (pass-if-read-request-line "GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1"
GET GET
(build-uri 'http (build-uri-reference
#:host "www.w3.org" #:scheme 'http
#:path "/pub/WWW/TheProject.html") #:host "www.w3.org"
#:path "/pub/WWW/TheProject.html")
(1 . 1)) (1 . 1))
(pass-if-read-request-line "GET /pub/WWW/TheProject.html HTTP/1.1" (pass-if-read-request-line "GET /pub/WWW/TheProject.html HTTP/1.1"
GET GET
(build-uri 'http (build-uri-reference
#:path "/pub/WWW/TheProject.html") #:path "/pub/WWW/TheProject.html")
(1 . 1)) (1 . 1))
(pass-if-read-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1" (pass-if-read-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1"
HEAD HEAD
(build-uri 'http (build-uri-reference
#:path "/etc/hosts" #:path "/etc/hosts"
#:query "foo=bar") #:query "foo=bar")
(1 . 1))) (1 . 1)))
(with-test-prefix "write-request-line" (with-test-prefix "write-request-line"
(pass-if-write-request-line "GET / HTTP/1.1" (pass-if-write-request-line "GET / HTTP/1.1"
GET GET
(build-uri 'http (build-uri-reference
#:path "/") #:path "/")
(1 . 1)) (1 . 1))
;;; FIXME: Test fails due to scheme, host always being removed. ;;; FIXME: Test fails due to scheme, host always being removed.
;;; However, it should be supported to request these be present, and ;;; However, it should be supported to request these be present, and
@ -188,8 +189,8 @@
;; (1 . 1)) ;; (1 . 1))
(pass-if-write-request-line "GET /pub/WWW/TheProject.html HTTP/1.1" (pass-if-write-request-line "GET /pub/WWW/TheProject.html HTTP/1.1"
GET GET
(build-uri 'http (build-uri-reference
#:path "/pub/WWW/TheProject.html") #:path "/pub/WWW/TheProject.html")
(1 . 1)) (1 . 1))
(pass-if-write-request-line "GET /?foo HTTP/1.1" (pass-if-write-request-line "GET /?foo HTTP/1.1"
GET GET
@ -197,9 +198,9 @@
(1 . 1)) (1 . 1))
(pass-if-write-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1" (pass-if-write-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1"
HEAD HEAD
(build-uri 'http (build-uri-reference
#:path "/etc/hosts" #:path "/etc/hosts"
#:query "foo=bar") #:query "foo=bar")
(1 . 1))) (1 . 1)))
(with-test-prefix "read-response-line" (with-test-prefix "read-response-line"
@ -298,6 +299,12 @@
(pass-if-parse content-length "010" 10) (pass-if-parse content-length "010" 10)
(pass-if-parse content-location "http://foo/" (pass-if-parse content-location "http://foo/"
(build-uri 'http #:host "foo" #:path "/")) (build-uri 'http #:host "foo" #:path "/"))
(pass-if-parse content-location "//foo/"
(build-uri-reference #:host "foo" #:path "/"))
(pass-if-parse content-location "/etc/foo"
(build-uri-reference #:path "/etc/foo"))
(pass-if-parse content-location "foo"
(build-uri-reference #:path "foo"))
(pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *)) (pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *))
(pass-if-parse content-range "bytes */*" '(bytes * *)) (pass-if-parse content-range "bytes */*" '(bytes * *))
(pass-if-parse content-range "bytes */30" '(bytes * 30)) (pass-if-parse content-range "bytes */30" '(bytes * 30))
@ -370,6 +377,14 @@
(pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30))) (pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30)))
(pass-if-parse referer "http://foo/bar?baz" (pass-if-parse referer "http://foo/bar?baz"
(build-uri 'http #:host "foo" #:path "/bar" #:query "baz")) (build-uri 'http #:host "foo" #:path "/bar" #:query "baz"))
(pass-if-parse referer "//foo/bar?baz"
(build-uri-reference #:host "foo"
#:path "/bar"
#:query "baz"))
(pass-if-parse referer "/etc/foo"
(build-uri-reference #:path "/etc/foo"))
(pass-if-parse referer "foo"
(build-uri-reference #:path "foo"))
(pass-if-parse te "trailers" '((trailers))) (pass-if-parse te "trailers" '((trailers)))
(pass-if-parse te "trailers,foo" '((trailers) (foo))) (pass-if-parse te "trailers,foo" '((trailers) (foo)))
(pass-if-parse user-agent "guile" "guile")) (pass-if-parse user-agent "guile" "guile"))

View file

@ -1,6 +1,6 @@
;;;; web-request.test --- HTTP requests -*- mode: scheme; coding: utf-8; -*- ;;;; web-request.test --- HTTP requests -*- mode: scheme; coding: utf-8; -*-
;;;; ;;;;
;;;; Copyright (C) 2010, 2011 Free Software Foundation, Inc. ;;;; Copyright (C) 2010, 2011, 2013 Free Software Foundation, Inc.
;;;; ;;;;
;;;; This library is free software; you can redistribute it and/or ;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public ;;;; modify it under the terms of the GNU Lesser General Public
@ -53,7 +53,8 @@ Accept-Language: en-gb, en;q=0.9\r
(pass-if (equal? (request-method r) 'GET)) (pass-if (equal? (request-method r) 'GET))
(pass-if (equal? (request-uri r) (build-uri 'http #:path "/qux"))) (pass-if (equal? (request-uri r)
(build-uri-reference #:path "/qux")))
(pass-if (equal? (read-request-body r) #f)) (pass-if (equal? (read-request-body r) #f))

View file

@ -1,6 +1,6 @@
;;;; web-uri.test --- URI library -*- mode: scheme; coding: utf-8; -*- ;;;; web-uri.test --- URI library -*- mode: scheme; coding: utf-8; -*-
;;;; ;;;;
;;;; Copyright (C) 2010, 2011, 2012, 2014 Free Software Foundation, Inc. ;;;; Copyright (C) 2010-2012, 2014, 2017 Free Software Foundation, Inc.
;;;; ;;;;
;;;; This library is free software; you can redistribute it and/or ;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public ;;;; modify it under the terms of the GNU Lesser General Public
@ -27,7 +27,7 @@
(define* (uri=? uri #:key scheme userinfo host port path query fragment) (define* (uri=? uri #:key scheme userinfo host port path query fragment)
(and (uri? uri) (and (uri-reference? uri)
(equal? (uri-scheme uri) scheme) (equal? (uri-scheme uri) scheme)
(equal? (uri-userinfo uri) userinfo) (equal? (uri-userinfo uri) userinfo)
(equal? (uri-host uri) host) (equal? (uri-host uri) host)
@ -123,6 +123,22 @@
"Expected.*host" "Expected.*host"
(build-uri 'http #:userinfo "foo"))) (build-uri 'http #:userinfo "foo")))
(with-test-prefix "build-uri-reference"
(pass-if "//host/etc/foo"
(uri=? (build-uri-reference #:host "host"
#:path "/etc/foo")
#:host "host"
#:path "/etc/foo"))
(pass-if "/path/to/some/foo?query"
(uri=? (build-uri-reference #:path "/path/to/some/foo"
#:query "query")
#:path "/path/to/some/foo"
#:query "query"))
(pass-if "nextdoc/foo"
(uri=? (build-uri-reference #:path "nextdoc/foo")
#:path "nextdoc/foo")))
(with-test-prefix "string->uri" (with-test-prefix "string->uri"
(pass-if "ftp:" (pass-if "ftp:"
@ -503,6 +519,30 @@
#:query "q" #:query "q"
#:fragment "bar"))) #:fragment "bar")))
(with-test-prefix "string->uri-reference"
(pass-if "/"
(uri=? (string->uri-reference "/")
#:path "/"))
(pass-if "/path/to/foo"
(uri=? (string->uri-reference "/path/to/foo")
#:path "/path/to/foo"))
(pass-if "//example.org"
(uri=? (string->uri-reference "//example.org")
#:host "example.org"
#:path ""))
(pass-if "//bar@example.org/path/to/foo"
(uri=? (string->uri-reference "//bar@example.org/path/to/foo")
#:userinfo "bar"
#:host "example.org"
#:path "/path/to/foo"))
(pass-if "nextdoc/foo"
(uri=? (string->uri-reference "nextdoc/foo")
#:path "nextdoc/foo")))
(with-test-prefix "uri->string" (with-test-prefix "uri->string"
(pass-if "ftp:" (pass-if "ftp:"
(equal? "ftp:" (equal? "ftp:"
@ -587,7 +627,23 @@
(pass-if "foo/?bar#baz" (pass-if "foo/?bar#baz"
(equal? "foo/?bar#baz" (equal? "foo/?bar#baz"
(uri->string (string->uri-reference "foo/?bar#baz"))))) (uri->string (string->uri-reference "foo/?bar#baz"))))
(pass-if "/path/to/foo"
(equal? "/path/to/foo"
(uri->string (string->uri-reference "/path/to/foo"))))
(pass-if "//example.org"
(equal? "//example.org"
(uri->string (string->uri-reference "//example.org"))))
(pass-if "//bar@example.org/path/to/foo"
(equal? "//bar@example.org/path/to/foo"
(uri->string (string->uri-reference "//bar@example.org/path/to/foo"))))
(pass-if "nextdoc/foo"
(equal? "nextdoc/foo"
(uri->string (string->uri-reference "nextdoc/foo")))))
(with-test-prefix "decode" (with-test-prefix "decode"
(pass-if "foo%20bar" (pass-if "foo%20bar"