The Layered Model of the Internet

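The Internet is implemented logically as a stack of layers, each with its own function. Like the floors of a building, each layer is supported by the one beneath it, and users only ever touch the topmost layer.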

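As the figure above shows, the Internet is divided into layers differently depending on the model used. Software development generally works with the five-layer protocol model, so what follows is a brief tour of those five layers.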

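1. Physical Layer
Before a computer can communicate with the outside world, it has to be connected to a network, whether by twisted-pair cable, optical fiber, or radio waves. This is the “physical layer”: the physical means of connecting computers together. It mainly specifies the electrical characteristics of the network, and its job is to carry the signals that represent 0s and 1s.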

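2. Data Link Layer
A bare stream of 0s and 1s means nothing by itself, so we have to agree on how to interpret the signal: how many bits form a group, and what does each bit position mean? That is the job of the “data link layer”. It sits above the physical layer and defines how the 0s and 1s carried by the physical layer are grouped and what they represent. In the early days every company had its own way of grouping signals; over time a protocol called Ethernet came to dominate.
Ethernet specifies that a group of signals forms a packet called a “frame”. Each frame has two parts: the header (Head) and the data (Data). The header carries descriptive fields for the packet, such as the sender, the receiver, and the data type; the data part is the actual content of the packet.
So how are the sender and receiver identified? Ethernet requires every device on the network to have a network interface card (NIC). A frame always travels from one NIC to another, and the addresses of the two NICs are the sending and receiving addresses of the frame. These are called MAC addresses. Every NIC leaves the factory with a MAC address that is meant to be unique worldwide.
The sender uses ARP (Address Resolution Protocol) to learn the receiver's MAC address. With the MAC address in hand, how does the data reach exactly the right receiver? Ethernet takes a rather “primitive” approach here: instead of delivering the frame only to the receiver, it sends the frame to every computer on the local network. Each computer reads the frame's header, finds the destination MAC address, and compares it with its own. If the two match, the computer accepts the frame for further processing; otherwise it discards the frame. This way of sending is called “broadcasting”.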
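To make the header/data split and the accept-or-discard check concrete, here is a minimal sketch in Python. It assumes the common Ethernet II layout (6-byte destination MAC, 6-byte source MAC, 2-byte type field); the MAC addresses and the helper names `parse_frame` and `nic_accepts` are made up for illustration.

```python
import struct

def mac_to_str(raw: bytes) -> str:
    """Format 6 raw bytes as the usual colon-separated MAC notation."""
    return ":".join(f"{b:02x}" for b in raw)

def parse_frame(frame: bytes) -> dict:
    """Split a frame into its header fields and its data part."""
    # Assumed Ethernet II header: destination MAC, source MAC, type.
    dst, src, frame_type = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": mac_to_str(dst),
        "src": mac_to_str(src),
        "type": frame_type,
        "data": frame[14:],
    }

def nic_accepts(frame: bytes, own_mac: str) -> bool:
    """A NIC keeps the frame only if the destination MAC is its own."""
    return parse_frame(frame)["dst"] == own_mac

# Hypothetical frame from ...:01 to ...:02 carrying an IPv4 payload (type 0x0800).
frame = bytes.fromhex("020000000002" + "020000000001" + "0800") + b"hello"

# Every host on the local network sees the frame; only one keeps it.
hosts = ["02:00:00:00:00:01", "02:00:00:00:00:02", "02:00:00:00:00:03"]
accepted_by = [mac for mac in hosts if nic_accepts(frame, mac)]
print(parse_frame(frame))
print(accepted_by)
```

Real NICs do this comparison in hardware; the loop over `hosts` only imitates the "everyone reads the header" step described above.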

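3. Network Layer ⭐⭐
With Ethernet's rules we can send data to a MAC address, but only within the sender's own subnetwork. If two computers are not on the same local area network, a broadcast cannot reach from one to the other.
We therefore need a way to tell which MAC addresses belong to the same subnetwork and which do not. Within one subnetwork, data is sent by broadcast; otherwise it is sent by “routing”. This need gave rise to the “network layer”. Its role is to introduce a new set of addresses that let us tell whether two computers belong to the same subnetwork. These are called “network addresses”.
Once the network layer exists, every computer has two kinds of address: a MAC address and a network address. The two are unrelated: the MAC address is bound to the NIC, while the network address is assigned by a network administrator. The network address identifies the subnetwork a computer is on, and the MAC address then delivers the packet to the target NIC inside that subnetwork. Logically, therefore, the network address must be processed first and the MAC address second.
The protocol that defines network addresses is the IP protocol, and the addresses it defines are called IP addresses. The most widely used version today is version 4, IPv4. IPv4 specifies that a network address consists of 32 binary bits. By convention we write an IP address as four decimal numbers separated by dots, ranging from 0.0.0.0 to 255.255.255.255.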
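The 32-bit structure and the "same subnetwork or not" decision can be sketched with Python's standard `ipaddress` module. The text above does not say how the subnetwork part of an address is marked, so the 24-bit prefix length here is an assumption, as are the example addresses and the helper names `same_subnet` and `delivery_method`.

```python
import ipaddress

# An IPv4 address is a 32-bit number written as four decimal fields.
addr = ipaddress.IPv4Address("192.168.1.10")
as_int = int(addr)
as_bits = f"{as_int:032b}"
print(as_int)    # 3232235786
print(as_bits)   # 32 characters, one per bit

def same_subnet(a: str, b: str, prefix_len: int = 24) -> bool:
    """True if a and b share their first prefix_len bits (assumed /24)."""
    network = ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(b) in network

def delivery_method(src: str, dst: str) -> str:
    """Same subnetwork: broadcast on the local network; otherwise: route."""
    return "broadcast" if same_subnet(src, dst) else "route"

print(delivery_method("192.168.1.10", "192.168.1.77"))
print(delivery_method("192.168.1.10", "192.168.2.77"))
```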
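Data sent according to the IP protocol is called an IP packet. An IP packet also has a header part and a data part: the header mainly holds the version, length, IP addresses, and similar information, and the data part is the actual content of the IP packet. The header is 20 to 60 bytes long, and the whole packet can be at most 65,535 bytes.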
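The size limits follow from the field widths: the header length is stored in a 4-bit field counted in 4-byte words (5 to 15 words, so 20 to 60 bytes), and the total length is a 16-bit field (at most 65,535). The sketch below builds a hypothetical 20-byte IPv4 header by hand and reads those fields back. The addresses are made up, and the checksum is left at 0 rather than computed, so this is for reading the layout only.

```python
import ipaddress
import struct

def parse_ipv4_header(packet: bytes) -> dict:
    """Read version, header length, total length, and addresses."""
    version_ihl, _tos, total_length = struct.unpack("!BBH", packet[:4])
    header_len = (version_ihl & 0x0F) * 4      # low 4 bits, in 4-byte words
    src, dst = struct.unpack("!4s4s", packet[12:20])
    return {
        "version": version_ihl >> 4,           # high 4 bits
        "header_len": header_len,
        "total_length": total_length,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
        "data": packet[header_len:total_length],
    }

payload = b"hello"
header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45,                 # version 4, header length 5 words = 20 bytes
    0,                    # type of service
    20 + len(payload),    # total length of header plus data
    0, 0,                 # identification, flags/fragment offset
    64,                   # time to live
    17,                   # protocol number carried in the data part (17 = UDP)
    0,                    # checksum, NOT computed in this sketch
    ipaddress.IPv4Address("192.168.1.10").packed,
    ipaddress.IPv4Address("192.168.2.77").packed,
)
info = parse_ipv4_header(header + payload)
print(info)
```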

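4. Transport Layer ⭐⭐⭐
With MAC addresses and IP addresses, any two hosts on the Internet can communicate. The problem is that many programs on the same host need to send and receive data over the network. QQ and a browser, for example, both connect to the Internet, so how do we tell which program a given packet belongs to? We need one more parameter that says which program (process) the packet is for. This parameter is called a “port”, and it is effectively a number assigned to each program that uses the NIC. Every packet is sent to a specific port on the host, so each program picks up only the data meant for it.
A port is an integer from 0 to 65535, which fits exactly in 16 binary bits. Ports 0 to 1023 are reserved for system services, so ordinary user programs normally choose ports above 1023. An IP address together with a port uniquely identifies a program on the Internet, which makes program-to-program communication across the network possible.
Port information has to be added to the packet, and that calls for new protocols. UDP and TCP are the two protocols we will deal with most from here on, and they underpin socket programming (WebSocket, for instance, runs over TCP); they are covered in detail later.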
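As a rough illustration of "adding port information to the packet", the sketch below packs and unpacks a UDP header, which consists of four 16-bit fields: source port, destination port, length, and checksum. The port numbers and the port-to-program table are hypothetical, and the checksum is left at 0 (which UDP over IPv4 permits) rather than computed.

```python
import struct

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix the payload with an 8-byte UDP header."""
    for port in (src_port, dst_port):
        if not 0 <= port <= 65535:            # a port must fit in 16 bits
            raise ValueError(f"port out of range: {port}")
    length = 8 + len(payload)                 # header plus data
    checksum = 0                              # not computed in this sketch
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

def parse_udp_datagram(datagram: bytes):
    """Return (source port, destination port, payload)."""
    src_port, dst_port, length, _checksum = struct.unpack("!HHHH", datagram[:8])
    return src_port, dst_port, datagram[8:length]

# Hypothetical table of which local program listens on which port.
listeners = {53000: "browser", 53001: "chat client"}

datagram = build_udp_datagram(80, 53000, b"page data")
src, dst, data = parse_udp_datagram(datagram)
print(f"from port {src} to port {dst}: hand {data!r} to the {listeners[dst]}")
```

The operating system does this dispatch for real sockets; the `listeners` dictionary only stands in for that lookup.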

5. Application Layer ⭐⭐
The application layer is where network applications and their application-layer protocols reside. It includes many protocols, such as HTTP (which provides for Web document request and transfer), SMTP (which provides for the transfer of e-mail messages), and FTP (which provides for the transfer of files between two end systems). An application-layer protocol is distributed over multiple end systems, with the application in one end system using the protocol to exchange packets of information with the application in another end system. We’ll refer to this packet of information at the application layer as a message.

Application Layer

The Web and HTTP

Overview of HTTP

HTTP defines how Web clients request Web pages from Web servers and how servers transfer Web pages to clients.

HTTP uses TCP as its underlying transport protocol (rather than running on top of UDP). The HTTP client first initiates a TCP connection with the server. Once the connection is established, the browser and the server processes access TCP through their socket interfaces.
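On the client side the socket interface is the door between the client process and the TCP connection; on the server side it is the door between the server process and the TCP connection. The client sends HTTP request messages into its socket interface and receives HTTP response messages from its socket interface. Similarly, the HTTP server receives request messages from its socket interface and sends response messages into its socket interface.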
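The "door" picture can be tried out without a network. The sketch below uses `socket.socketpair()`, which returns two already-connected sockets, as a stand-in for an established TCP connection; it is not a real TCP connection, and the host name, path, and body are placeholders.

```python
import socket

# Two connected sockets: one door for the client, one for the server.
client_sock, server_sock = socket.socketpair()

# Client: push an HTTP request message into its socket.
request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Connection: close\r\n"
    "\r\n"
).encode("ascii")
client_sock.sendall(request)

# Server: read from its socket until the blank line that ends the headers.
received = b""
while b"\r\n\r\n" not in received:
    received += server_sock.recv(4096)

# Server: push an HTTP response message into its socket, then close.
body = b"<h1>hello</h1>"
response_head = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    f"Content-Length: {len(body)}\r\n"
    "Connection: close\r\n"
    "\r\n"
).encode("ascii")
server_sock.sendall(response_head + body)
server_sock.close()

# Client: read the response from its socket until the server closes.
reply = b""
while chunk := client_sock.recv(4096):
    reply += chunk
client_sock.close()

status_line = reply.split(b"\r\n", 1)[0].decode("ascii")
print(status_line)
```

Neither side touches anything below the socket: each process only writes into and reads from its own door.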

Non-Persistent and Persistent Connections

When this client-server interaction is taking place over TCP, the application developer needs to make an important decision—should each request/response pair be sent over a separate TCP connection, or should all of the requests and their corresponding responses be sent over the same TCP connection? In the former approach, the application is said to use non-persistent connections; and in the latter approach, persistent connections.

With non-persistent connections, a brand-new connection must be established and maintained for each requested object. For each of these connections, TCP buffers must be allocated and TCP variables must be kept in both the client and server. This can place a significant burden on the Web server, which may be serving requests from hundreds of different clients simultaneously.

With HTTP/1.1 persistent connections, the server leaves the TCP connection open after sending a response. Subsequent requests and responses between the same client and server can be sent over the same connection. In particular, an entire Web page can be sent over a single persistent TCP connection. Moreover, multiple Web pages residing on the same server can be sent from the server to the same client over a single persistent TCP connection. These requests for objects can be made back-to-back, without waiting for replies to pending requests (pipelining). When the server receives the back-to-back requests, it sends the objects back-to-back. Typically, the HTTP server closes a connection when it isn’t used for a certain time (a configurable timeout interval).

HTTP/1.1 uses persistent connections by default. Pipelining is permitted on top of them, although in practice most browsers leave it disabled.
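The difference in connection count can be observed locally. The sketch below starts a throwaway HTTP/1.1 server on the loopback interface using Python's standard `http.server`, then fetches three objects once over a single reused connection and once with a new connection per object. The paths, the `connections_seen` counter, and the `fetch_*` helpers are invented for the demo, and pipelining is not shown, since `http.client` sends one request at a time.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

connections_seen = []  # one entry per TCP connection the server accepts

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def setup(self):
        super().setup()            # runs once per accepted connection
        connections_seen.append(self.client_address)

    def do_GET(self):
        body = f"object {self.path}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass                       # keep the demo quiet

def fetch_persistent(port, paths):
    """All requests share one TCP connection."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    bodies = []
    for path in paths:
        conn.request("GET", path)
        bodies.append(conn.getresponse().read().decode("ascii"))
    conn.close()
    return bodies

def fetch_non_persistent(port, paths):
    """Each request opens, uses, and closes its own TCP connection."""
    bodies = []
    for path in paths:
        conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", path, headers={"Connection": "close"})
        bodies.append(conn.getresponse().read().decode("ascii"))
        conn.close()
    return bodies

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0: OS picks one
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
paths = ["/a", "/b", "/c"]

persistent_bodies = fetch_persistent(port, paths)
persistent_connections = len(connections_seen)

non_persistent_bodies = fetch_non_persistent(port, paths)
non_persistent_connections = len(connections_seen) - persistent_connections

server.shutdown()
server.server_close()
print(persistent_connections, non_persistent_connections)
```

Both runs return the same three bodies; only the number of connections the server had to set up differs.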

HTTP Message Format

HTTP Request Message

Below we provide a typical HTTP request message:

GET /index.php/archives/300/ HTTP/1.1
Host: theoyu.top
Connection: close
User-Agent: Mozilla/5.0  
Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2

We can see that the message is written in ordinary ASCII text. Although this particular request message has five lines (I deleted some of the original lines), a request message can have many more lines or as few as one line.

The first line of an HTTP request message is called the request line; the subsequent lines are called the header lines. The request line has three fields: the method field, the URL field, and the HTTP version field. The method field can take on several different values, including GET, POST, HEAD, PUT, and DELETE.
The great majority of HTTP request messages use the GET method. The GET method is used when the browser requests an object, with the requested object identified in the URL field. In this example, the browser is requesting the object /index.php/archives/300/. The version is self-explanatory; in this example, the browser implements version HTTP/1.1.

The header line Host: theoyu.top specifies the host on which the object resides.
By including the Connection: close header line, the browser is telling the server that it doesn’t want to bother with persistent connections; it wants the server to close the connection after sending the requested object.
The User-Agent: header line specifies the user agent, that is, the browser type that is making the request to the server. Here the user agent is given as Mozilla/5.0. Most modern browsers, Firefox included, begin their user-agent string with this token, so on its own it does not identify a particular browser.
Finally, the Accept-Language: header indicates that the user prefers to receive a Simplified Chinese (zh-CN) version of the object, with the q values ranking the other listed languages as fallbacks.
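The request line and header lines described above can be pulled apart with a few string operations. This is a minimal sketch rather than a complete HTTP parser: it assumes well-formed input with CRLF line endings, and `parse_request` and `preferred_languages` are names chosen for the example.

```python
def parse_request(raw: str):
    """Split a request into method, URL, version, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")   # blank line ends the headers
    lines = head.split("\r\n")
    method, url, version = lines[0].split(" ")  # the three request-line fields
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, url, version, headers, body

def preferred_languages(value: str):
    """Order Accept-Language entries by q value (1.0 when q is absent)."""
    prefs = []
    for item in value.split(","):
        tag, _, q = item.strip().partition(";q=")
        prefs.append((tag, float(q) if q else 1.0))
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)

raw_request = (
    "GET /index.php/archives/300/ HTTP/1.1\r\n"
    "Host: theoyu.top\r\n"
    "Connection: close\r\n"
    "User-Agent: Mozilla/5.0\r\n"
    "Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2\r\n"
    "\r\n"
)

method, url, version, headers, body = parse_request(raw_request)
languages = preferred_languages(headers["Accept-Language"])
print(method, url, version)
print(headers["Host"], headers["Connection"])
print(languages[0])
```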

Note that after the header lines (and the additional carriage return and line feed) there is an “entity body.” The entity body is empty with the GET method, but is used with the POST method. An HTTP client often uses the POST method when the user fills out a form, for example when a user provides search words to a search engine. With a POST message, the user is still requesting a Web page from the server, but the specific contents of the Web page depend on what the user entered into the form fields.

HTTP Response Message

Below we provide a typical HTTP response message:

HTTP/1.1 200 OK
Connection: close
Date: Tue, 18 Aug 2015 15:44:04 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Tue, 18 Aug 2015 15:11:03 GMT
Content-Length: 6821
Content-Type: text/html

It has three sections: an initial status line, six header lines, and then the entity body (omitted here).
The status line has three fields: the protocol version field, a status code, and a corresponding status message. In this example, the status line indicates that the server is using HTTP/1.1 and that everything is OK (that is, the server has found, and is sending, the requested object).
The header lines are similar to those of the request.
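Reading the three status-line fields and the header lines takes only a few lines of Python. This is a sketch for well-formed input, with `parse_response_head` as an illustrative name; the lookup in the standard `http.HTTPStatus` enum is there only to show that 200 and "OK" belong together.

```python
from http import HTTPStatus

def parse_response_head(raw: str):
    """Return (version, status code, status message, headers)."""
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Split at most twice: the status message itself may contain spaces.
    version, code, message = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(code), message, headers

raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Connection: close\r\n"
    "Date: Tue, 18 Aug 2015 15:44:04 GMT\r\n"
    "Server: Apache/2.2.3 (CentOS)\r\n"
    "Last-Modified: Tue, 18 Aug 2015 15:11:03 GMT\r\n"
    "Content-Length: 6821\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

version, code, message, headers = parse_response_head(raw_response)
body_size = int(headers["Content-Length"])   # bytes of entity body to expect
print(version, code, message, HTTPStatus(code).phrase)
print(len(headers), body_size)
```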

User-Server Interaction: Cookies

An HTTP server is stateless, but it is often desirable for a Web site to identify users, either because the server wishes to restrict user access or because it wants to serve content as a function of the user identity. For these purposes, HTTP uses cookies. Cookies, defined in [RFC 6265], allow sites to keep track of users. Most major commercial Web sites use cookies today.

As shown in the above diagram, cookie technology has four components:
(1) a cookie header line in the HTTP response message;
(2) a cookie header line in the HTTP request message;
(3) a cookie file kept on the user’s end system and managed by the user’s browser;
(4) a back-end database at the Web site.
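The four components can be walked through in a toy simulation using Python's standard `http.cookies` module. Everything here is a stand-in: the dictionaries play the roles of the browser's cookie file and the site's database, and the cookie name `id`, the value `1678`, and the function names are invented for the example.

```python
from http.cookies import SimpleCookie

user_db = {}       # (4) back-end database at the Web site
cookie_file = {}   # (3) cookie file kept by the user's browser

def server_first_response(new_id: str) -> str:
    """(1) Create a database entry and return a Set-Cookie header line."""
    user_db[new_id] = {"visits": 1}
    cookie = SimpleCookie()
    cookie["id"] = new_id
    return cookie.output()

def browser_store(set_cookie_line: str) -> None:
    """(3) Save the name/value pairs from a Set-Cookie line."""
    parsed = SimpleCookie()
    parsed.load(set_cookie_line.split(":", 1)[1].strip())
    for name, morsel in parsed.items():
        cookie_file[name] = morsel.value

def browser_request_header() -> str:
    """(2) Build the Cookie header line sent with later requests."""
    pairs = "; ".join(f"{name}={value}" for name, value in cookie_file.items())
    return f"Cookie: {pairs}"

def server_later_request(cookie_line: str) -> int:
    """(4) Look the user up by cookie value and update the stored state."""
    parsed = SimpleCookie()
    parsed.load(cookie_line.split(":", 1)[1].strip())
    user_id = parsed["id"].value
    user_db[user_id]["visits"] += 1
    return user_db[user_id]["visits"]

set_cookie_line = server_first_response("1678")
browser_store(set_cookie_line)
cookie_line = browser_request_header()
visits = server_later_request(cookie_line)
print(set_cookie_line)
print(cookie_line)
print(visits)
```

The server itself stays stateless between requests; the identity travels in the header lines and the state lives in the database.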

Transport Layer

Network Layer

Link Layer

Physical Layer

Last modified: February 21, 2021, 01:25 PM