Working with System.Uri


System.Uri is the .NET Framework class used to represent a URI. Using the System.Uri class, you can validate, parse, combine, and compare URIs. You construct an instance of System.Uri by supplying a string representation of the URI in the constructor.

C#

 try
 {
     Uri uri = new Uri("http://www.contoso.com/list.htm#new");
     Console.WriteLine(uri.ToString());
 }
 catch (UriFormatException uex)
 {
     Console.WriteLine(uex.ToString());
 }

Visual Basic .NET

 Try
     Dim uri As New Uri("http://www.contoso.com/list.htm#new")
     Console.WriteLine(uri.ToString)
 Catch uex As UriFormatException
     Console.WriteLine(uex.ToString)
 End Try

The code in this sample constructs a new URI instance and displays the URI to the console. Although constructing a URI is a very simple task, there are a number of things to consider when working with a URI.
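Beyond construction, the parsing and combining mentioned earlier can be sketched as follows. This is a minimal illustration rather than code from the chapter: the UriBasics class, the Describe and Combine helper names, and the catalog/ base path are assumptions made for the example.

```csharp
using System;

public static class UriBasics
{
    // Hypothetical helper: joins the main components of an absolute URI into one string.
    public static string Describe(Uri uri)
    {
        return uri.Scheme + " | " + uri.Host + " | " + uri.Port + " | "
             + uri.AbsolutePath + " | " + uri.Fragment;
    }

    // Hypothetical helper: resolves a relative reference against a base URI.
    public static Uri Combine(string baseUri, string relative)
    {
        return new Uri(new Uri(baseUri), relative);
    }

    public static void Main()
    {
        Uri uri = new Uri("http://www.contoso.com/list.htm#new");
        // Port is reported even though the raw string omits it.
        Console.WriteLine(Describe(uri));
        Console.WriteLine(Combine("http://www.contoso.com/catalog/", "list.htm"));
    }
}
```

Both constructors throw UriFormatException on malformed input, so in real code they would sit inside the same try/catch pattern shown in the sample above.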

Canonicalization

Canonicalization is the process of converting a URI into its simplest form. This process is important because there are multiple ways to express a URI in its raw or string form that ultimately canonicalize into the same URI. Consider the following example:

C#

 try
 {
     // Note the raw URIs are all different
     Uri uriOne = new Uri("http://www.contoso.com/Prod list.htm");
     Uri uriTwo = new Uri("http://www.contoso.com:80/Prod%20list.htm");
     Uri uriThree = new Uri("http://www.contoso.com/Prod%20list.htm");

     // The canonical representation is the same for all three
     Console.WriteLine("uriOne = " + uriOne.ToString());
     Console.WriteLine("uriTwo = " + uriTwo.ToString());
     Console.WriteLine("uriThree = " + uriThree.ToString());
 }
 catch (UriFormatException uex)
 {
     Console.WriteLine(uex.ToString());
 }

Visual Basic .NET

 Try
     ' Note the raw URIs are all different
     Dim uriOne As New Uri("http://www.contoso.com/Prod list.htm")
     Dim uriTwo As New Uri("http://www.contoso.com:80/Prod%20list.htm")
     Dim uriThree As New Uri("http://www.contoso.com/Prod%20list.htm")

     ' The canonical representation is the same for all three
     Console.WriteLine("uriOne = " & uriOne.ToString())
     Console.WriteLine("uriTwo = " & uriTwo.ToString())
     Console.WriteLine("uriThree = " & uriThree.ToString())
 Catch uex As UriFormatException
     Console.WriteLine(uex.ToString)
 End Try

In this example, the space can be either a literal value or an escaped form. Also, the :80 port number can be excluded in the canonical form because it's the default port for the scheme for this URI. The canonical representation of a URI can be obtained by calling the ToString method of System.Uri.
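The two effects described here, the dropped default port and the interchangeable literal and escaped space, can also be checked programmatically. The sketch below is an assumption-laden illustration: the CanonicalForms class and SameCanonicalForm helper are hypothetical names, and it compares the AbsoluteUri property (the escaped form) rather than the ToString output discussed above.

```csharp
using System;

public static class CanonicalForms
{
    // Hypothetical helper: true when two raw strings canonicalize to the same escaped URI.
    public static bool SameCanonicalForm(string rawA, string rawB)
    {
        return new Uri(rawA).AbsoluteUri == new Uri(rawB).AbsoluteUri;
    }

    public static void Main()
    {
        Uri withPort = new Uri("http://www.contoso.com:80/Prod%20list.htm");

        // Port 80 is the default for http, so it does not appear in the canonical form.
        Console.WriteLine(withPort.IsDefaultPort);
        Console.WriteLine(withPort.AbsoluteUri);

        // A literal space and %20 lead to the same canonical URI.
        Console.WriteLine(SameCanonicalForm(
            "http://www.contoso.com/Prod list.htm",
            "http://www.contoso.com:80/Prod%20list.htm"));
    }
}
```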

Comparing URIs

It's often useful to compare two URIs. However, it's important to understand that System.Uri compares URIs in their canonical form instead of in their raw form. Consider the following example:

C#

 try
 {
     // Note the raw URIs are different
     Uri uriOne = new Uri("http://www.contoso.com/Prod list.htm");
     Uri uriTwo = new Uri("http://www.contoso.com:80/Prod%20list.htm");

     // Comparison is based on the canonical representation,
     // so uriOne and uriTwo will be equal.
     Console.WriteLine(uriOne.Equals(uriTwo));
 }
 catch (UriFormatException uex)
 {
     Console.WriteLine(uex.ToString());
 }

Visual Basic .NET

 Try
     ' Note the raw URIs are different
     Dim uriOne As New Uri("http://www.contoso.com/Prod list.htm")
     Dim uriTwo As New Uri("http://www.contoso.com:80/Prod%20list.htm")

     ' Comparison is based on the canonical representation,
     ' so uriOne and uriTwo will be equal.
     Console.WriteLine(uriOne.Equals(uriTwo))
 Catch uex As UriFormatException
     Console.WriteLine(uex.ToString)
 End Try

Another interesting point to note is that because the fragment isn't considered part of the URI, it's omitted from the URI comparison in System.Uri. For example, a comparison of http://www.contoso.com/Prod list.htm and http://www.contoso.com/Prod list.htm#newItems will return true because #newItems is ignored.
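When the fragment does matter to an application, one option is to compare it separately. The following is a minimal sketch of that idea, not a technique from the chapter; the FragmentComparison class and EqualsIncludingFragment helper are hypothetical names.

```csharp
using System;

public static class FragmentComparison
{
    // Hypothetical helper: equal only if the URIs match and their fragments match too.
    public static bool EqualsIncludingFragment(Uri a, Uri b)
    {
        return a.Equals(b) && a.Fragment == b.Fragment;
    }

    public static void Main()
    {
        Uri plain = new Uri("http://www.contoso.com/Prod list.htm");
        Uri withFragment = new Uri("http://www.contoso.com/Prod list.htm#newItems");

        // Equals ignores the fragment, so these two compare as equal.
        Console.WriteLine(plain.Equals(withFragment));

        // Checking the Fragment property as well tells them apart.
        Console.WriteLine(EqualsIncludingFragment(plain, withFragment));
    }
}
```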

Working with Schemes

As described earlier in this chapter, the scheme part of a URI is the element at the beginning of the URI that defines how the URI can be parsed and, in the case of a URL, resolved. Most schemes define a scheme-specific part that follows the general guidelines listed earlier in this chapter of having an authority, a path, and (potentially) a query component. However, schemes are not required to follow this pattern. In fact, some schemes define their own logic that does not correspond to these common parts. For example, consider the following URIs:

 http://www.contoso.com/Prodlist.htm
 mailto:cdo@contoso.com?meg=kate

The first URI is an example of the HTTP scheme. It defines an authority (www.contoso.com) and a path (Prodlist.htm). The second URI is an example of the MAILTO scheme. MAILTO does not define authority and path components. Rather, it defines a to component and a headers component. In this example, the to value is cdo@contoso.com, the header name is meg, and the header value is kate.
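To make the MAILTO layout concrete, the split into a to value, a header name, and a header value can be sketched with plain string operations. This is an illustration only: the MailtoSketch class and Split helper are hypothetical, the code handles a single header, and it performs no unescaping or validation.

```csharp
using System;

public static class MailtoSketch
{
    // Hypothetical helper: splits "mailto:to?name=value" into { to, header name, header value }.
    public static string[] Split(string mailtoUri)
    {
        // Everything after the first colon is the scheme-specific part.
        int colon = mailtoUri.IndexOf(':');
        string rest = mailtoUri.Substring(colon + 1);

        // The to component ends at the first question mark, if there is one.
        int question = rest.IndexOf('?');
        string to = question < 0 ? rest : rest.Substring(0, question);
        string header = question < 0 ? "" : rest.Substring(question + 1);

        // A header is written as name=value.
        int eq = header.IndexOf('=');
        string name = eq < 0 ? header : header.Substring(0, eq);
        string value = eq < 0 ? "" : header.Substring(eq + 1);

        return new string[] { to, name, value };
    }

    public static void Main()
    {
        string[] parts = Split("mailto:cdo@contoso.com?meg=kate");
        Console.WriteLine("to: " + parts[0]);
        Console.WriteLine("header name: " + parts[1]);
        Console.WriteLine("header value: " + parts[2]);
    }
}
```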

In general, System.Uri simply looks for the colon to separate the scheme from the scheme-specific part. There is one exception to this rule that developers should understand. Because it's common for URIs of the file: scheme to be entered without the scheme, as in c:\test\test.htm, System.Uri supports the automatic conversion of local paths (c:\test\test.htm) to file: scheme URIs (file:///c:/test/test.htm). As a consequence, a single-character scheme is read as a drive letter, and System.Uri treats the input as a file: URI.
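The local-path conversion can be observed directly. The sketch below is a small illustration under stated assumptions: the LocalPathSketch class and FromLocalPath helper are hypothetical names, and the exact strings printed may vary by runtime and platform, so no output is shown.

```csharp
using System;

public static class LocalPathSketch
{
    // Hypothetical helper: builds a Uri from a drive-letter path with no explicit scheme.
    public static Uri FromLocalPath(string path)
    {
        return new Uri(path);
    }

    public static void Main()
    {
        Uri uri = FromLocalPath(@"c:\test\test.htm");

        // The drive letter is not treated as a scheme; the URI is a file: URI.
        Console.WriteLine(uri.Scheme);
        Console.WriteLine(uri.IsFile);
        Console.WriteLine(uri.AbsoluteUri);
    }
}
```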

System.Uri has an in-depth understanding of a number of the most commonly used schemes so that it can take these special cases into account. The following list represents the schemes understood by System.Uri in version 1.1 of the .NET Framework:

  • FILE

  • HTTP

  • HTTPS

  • FTP

  • GOPHER

  • MAILTO

  • NEWS

  • NNTP

  • UUID

  • TELNET

  • LDAP

  • SOAP

Although this list is expected to grow over time, the fact that schemes can be defined at any time ensures that there will be cases in which System.Uri encounters a scheme that it does not recognize. In those cases, System.Uri will fall back to using parsing logic based on the general URI components described at the beginning of this chapter. If that URI scheme follows these general component recommendations, the URI will parse just fine. However, if that unknown scheme has defined its own scheme-specific part that does not follow the common pattern, such as with MAILTO, System.Uri does not have a way of knowing how to parse out the components and will throw a UriFormatException if it can't map the scheme into the common pattern. Consider the following example:

C#

 try
 {
     Console.WriteLine("Unknown scheme that follows the general pattern");
     Uri uriUnknown = new Uri("unknown://authority/path?query");
     Console.WriteLine("scheme:" + uriUnknown.Scheme);
     Console.WriteLine("authority:" + uriUnknown.Authority);
     Console.WriteLine("path and query:" + uriUnknown.PathAndQuery);

     Console.WriteLine();
     Console.WriteLine("Unknown scheme that uses a custom pattern");
     Uri uriUnknownCustom = new Uri("unknown:path.authority.query");
     Console.WriteLine("scheme:" + uriUnknownCustom.Scheme);
     Console.WriteLine("authority:" + uriUnknownCustom.Authority);
     Console.WriteLine("path and query:" + uriUnknownCustom.PathAndQuery);
 }
 catch (UriFormatException uex)
 {
     Console.WriteLine(uex.ToString());
 }

Visual Basic .NET

 Try
     Console.WriteLine("Unknown scheme that follows the general pattern")
     Dim uriUnknown As New Uri("unknown://authority/path?query")
     Console.WriteLine("scheme:" & uriUnknown.Scheme)
     Console.WriteLine("authority:" & uriUnknown.Authority)
     Console.WriteLine("path and query:" & uriUnknown.PathAndQuery)

     Console.WriteLine()
     Console.WriteLine("Unknown scheme that uses a custom pattern")
     Dim uriUnknownCustom As New Uri("unknown:path.authority.query")
     Console.WriteLine("scheme:" & uriUnknownCustom.Scheme)
     Console.WriteLine("authority:" & uriUnknownCustom.Authority)
     Console.WriteLine("path and query:" & uriUnknownCustom.PathAndQuery)
 Catch ex As Exception
     Console.WriteLine(ex.ToString)
 End Try

This sample outputs the following to the console:

 Unknown scheme that follows the general pattern
 scheme:unknown
 authority:authority
 path and query:/path?query

 Unknown scheme that uses a custom pattern
 scheme:unknown
 authority:
 path and query:path.authority.query

In this sample, System.Uri is able to correctly parse the unknown scheme that uses the general scheme pattern. However, in the case where the scheme-specific part is based on a custom pattern, the authority is not parsed because the logic for parsing the component parts is not defined.

Note  

In version 1.1 of the .NET Framework, there's no way to plug in custom parsing logic for a URI scheme that does not follow the general URI pattern and is not known by System.Uri. This lack of support for custom URI scheme parsing is expected to change in the next major release of the .NET Framework. In general, if you have to create a new scheme, it's best to follow the general component syntax of scheme://authority/path?query because most URI parsing libraries will understand the scheme.

Parsing Host Names

Although the concept of a host is not explicitly defined as part of the URI, a host name is often referenced as the authority portion of the URI. Therefore, you should consider the following points when dealing with host names in System.Uri, especially in the case of HTTP URIs:

  • System.Uri supports a fully qualified DNS name (FQDN), an IP address, or a machine name as the host name.

  • System.Uri always converts the host name to lowercase characters as part of parsing the URI.

  • Internet Protocol version 6 (IPv6) addresses should be entered inside square brackets for URI construction, for example, http://[::1]/path.

  • Internet Protocol version 4 (IPv4) addresses can be entered in their conventional dot-separated format, for example, http://127.0.0.1.
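The points in this list can be observed through the Host and HostNameType properties. The following is a minimal sketch rather than code from the chapter; the HostNameSketch class, the DescribeHost helper, and the myserver machine name are assumptions made for the example.

```csharp
using System;

public static class HostNameSketch
{
    // Hypothetical helper: returns "host (kind)" for an absolute URI string.
    public static string DescribeHost(string rawUri)
    {
        Uri uri = new Uri(rawUri);
        return uri.Host + " (" + uri.HostNameType + ")";
    }

    public static void Main()
    {
        // An FQDN in mixed case is lowercased during parsing.
        Console.WriteLine(DescribeHost("http://WWW.Contoso.COM/"));

        // A bare machine name is also accepted as a host.
        Console.WriteLine(DescribeHost("http://myserver/"));

        // IPv4 in dotted form, IPv6 inside square brackets.
        Console.WriteLine(DescribeHost("http://127.0.0.1"));
        Console.WriteLine(DescribeHost("http://[::1]/path"));
    }
}
```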




Network Programming for the Microsoft® .NET Framework (Pro-Developer)
ISBN: 073561959X
Year: 2003
Pages: 121
