Clumsy Docker networking notes

Sources:

Concepts

Docker and network namespaces

The following example uses the default parameters and configuration of a running Docker container.

$ sudo ip netns list
# Start a container
$ docker run --name nstest -it busybox

# Open another terminal session
$ sudo ip netns list
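# Nothing is listed: Docker does not create entries under /var/run/netns.
# Get the container's process ID.
$ pid="$(docker inspect -f '{{.State.Pid}}' nstest)"
# Symlink the container's network namespace so that `ip netns` can see it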
$ sudo mkdir -p /var/run/netns/
$ sudo ln -sf /proc/$pid/ns/net /var/run/netns/nstest
$ sudo ip netns list
nstest (id: 0)
# ^ The namespace is now listed.
$ docker exec nstest ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
7: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever

$ sudo ip netns exec nstest ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
7: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever
# ^ Same result as the `docker exec` command.
$ ip a | grep docker
5: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
8: veth4866e91@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default
$ brctl show
bridge name     bridge id               STP enabled     interfaces
docker0         8000.0242f0c749bd       no              veth4866e91 # <- host end of the container's veth pair
lxdbr0          8000.00163e2afe62       no
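
The `eth0@if8` and `veth4866e91@if7` suffixes in the output above are interface indexes: each end of the veth pair names the index of its peer. Below is a minimal sketch of looking up the host-side interface for a given index, assuming the usual sysfs layout where `/sys/class/net/<iface>/ifindex` holds each interface's index. The function name `find_iface_by_index` and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# find_iface_by_index INDEX [SYSFS_NET_DIR]
# Print the name of every interface whose ifindex file contains INDEX.
# SYSFS_NET_DIR defaults to /sys/class/net; the argument exists only so the
# function can be pointed at a copy of the tree (hypothetical helper).
find_iface_by_index() {
    want="$1"
    root="${2:-/sys/class/net}"
    for f in "$root"/*/ifindex; do
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "$want" ]; then
            # .../veth4866e91/ifindex -> veth4866e91
            basename "$(dirname "$f")"
        fi
    done
}
```

On the host from the session above, `find_iface_by_index 8` would be expected to print `veth4866e91`. The index to look for can be read inside the container from `/sys/class/net/eth0/iflink`.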

A quick ramble through the Docker networking subsystem

The Docker networking subsystem has a pluggable architecture and can be extended with drivers. The following drivers are available by default: bridge, host, none, overlay, macvlan, ipvlan.

This post focuses mainly on the bridge driver.

docker0 - default bridge network

# Start a second container
$ docker run --name nstest2 -it busybox
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
9: eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever
/ # ping nstest
ping: bad address 'nstest'
/ # ping 172.17.0.2
PING 172.17.0.2 (172.17.0.2): 56 data bytes
64 bytes from 172.17.0.2: seq=0 ttl=64 time=0.263 ms
^C
--- 172.17.0.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.263/0.263/0.263 ms
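
In both containers above, the MAC address looks like the fixed prefix `02:42` followed by the four bytes of the IPv4 address in hex (`172.17.0.2` gives `ac:11:00:02`). The sketch below reproduces that mapping. It is an observation from this session, not behavior to rely on, and `ip_to_mac` is a hypothetical helper name.

```shell
#!/bin/sh
# ip_to_mac A.B.C.D
# Print 02:42 followed by the IPv4 octets in hex, matching the pattern
# seen in the `ip a` output above (assumption: not a guaranteed scheme).
ip_to_mac() {
    # Split the dotted quad into four positional parameters.
    old_ifs="$IFS"
    IFS=.
    set -- $1
    IFS="$old_ifs"
    printf '02:42:%02x:%02x:%02x:%02x\n' "$1" "$2" "$3" "$4"
}

ip_to_mac 172.17.0.2
ip_to_mac 172.17.0.3
```

This prints `02:42:ac:11:00:02` and `02:42:ac:11:00:03`, the same addresses shown for nstest and nstest2.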

User-defined bridge network

$ docker network create mynet
ed7da300a506e6e9be68b8f69d1b857dd7cc8db2a9d51c4b85dba37976780293

$ docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
935865833d99   bridge    bridge    local
f753aeafa1a1   host      host      local
ed7da300a506   mynet     bridge    local
6b0488ceccc8   none      null      local

$ docker network inspect mynet
[
    {
        "Name": "mynet",
        "Id": "ed7da300a506e6e9be68b8f69d1b857dd7cc8db2a9d51c4b85dba37976780293",
        "Created": "2023-12-12T09:37:34.358888862+07:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {},
        "Options": {},
        "Labels": {}
    }
]

# Check with `ip addr show`: the network should show up as a network
# interface on the host, named br-<first 12 characters of the network ID>
$ ip addr show
# ...
11: br-ed7da300a506: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:f8:36:9b:e0 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 brd 172.18.255.255 scope global br-ed7da300a506
       valid_lft forever preferred_lft forever
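
The naming rule seen above (`br-` plus the first 12 characters of the network ID) can be sketched as a small helper. `bridge_name` is a hypothetical name, and the sketch assumes the network was created without a custom bridge name (the `com.docker.network.bridge.name` driver option overrides it).

```shell
#!/bin/sh
# bridge_name NETWORK_ID
# Print the host interface name expected for a user-defined bridge network:
# "br-" plus the first 12 characters of the network ID. Assumes the network
# was created without a custom bridge name option.
bridge_name() {
    printf 'br-%s\n' "$(printf '%s' "$1" | cut -c1-12)"
}

bridge_name ed7da300a506e6e9be68b8f69d1b857dd7cc8db2a9d51c4b85dba37976780293
```

For the ID of mynet this prints `br-ed7da300a506`. With Docker available, the ID can be obtained with `docker network inspect -f '{{.Id}}' mynet`.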

$ brctl show
bridge name     bridge id               STP enabled     interfaces
br-ed7da300a506 8000.0242f8369be0       no
docker0         8000.0242f0c749bd       no
lxdbr0          8000.00163e2afe62       no
$ docker run --name=web --network=mynet -d wbitt/network-multitool
e2d841b7a02916aa25a2fa5ceace0427ad7c0ccc967ad01fb5cc423460f0e990

$ docker run --name=db --network=mynet -e MYSQL_ROOT_PASSWORD=secret -d mysql
492ba8a9277d96876f0a605d9434723aed774978a40023e778a97815216754f0

$ docker ps
CONTAINER ID   IMAGE                     COMMAND                  CREATED              STATUS              PORTS                                  NAMES
e2d841b7a029   wbitt/network-multitool   "/bin/sh /docker/ent…"   15 seconds ago       Up 14 seconds       80/tcp, 443/tcp, 1180/tcp, 11443/tcp   web
492ba8a9277d   mysql                     "docker-entrypoint.s…"   About a minute ago   Up About a minute   3306/tcp, 33060/tcp                    db

# Ping each other
$ docker exec -it web bash
e2d841b7a029:/# ping -c 1 db
PING db (172.18.0.2) 56(84) bytes of data.
64 bytes from db.mynet (172.18.0.2): icmp_seq=1 ttl=64 time=0.096 ms

--- db ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.096/0.096/0.096/0.000 ms
e2d841b7a029:/# dig db

; <<>> DiG 9.18.16 <<>> db
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57615
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;db.                            IN      A

;; ANSWER SECTION:
db.                     600     IN      A       172.18.0.2

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11) (UDP)
;; WHEN: Tue Dec 12 03:18:39 UTC 2023
;; MSG SIZE  rcvd: 38

e2d841b7a029:/#

e2d841b7a029:/# cat /etc/resolv.conf
nameserver 127.0.0.11
options edns0 trust-ad ndots:0
e2d841b7a029:/# netstat -ntlup
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.11:45541        0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      1/nginx: master pro
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      1/nginx: master pro
udp        0      0 127.0.0.11:42758        0.0.0.0:*                           -
$ docker run \
    --name tool \
    --network mynet \
    --cap-add=NET_ADMIN \
    --cap-add=NET_RAW \
    -it wbitt/network-multitool /bin/bash
# Note: -c expects a query class, so dig ignores "db" and asks for the root NS records instead
ccd72fe225c8:/# dig -c db
;; Warning, ignoring invalid class db

; <<>> DiG 9.18.16 <<>> -c db
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17778
;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;.                              IN      NS

;; ANSWER SECTION:
.                       31133   IN      NS      d.root-servers.net.
.                       31133   IN      NS      l.root-servers.net.
.                       31133   IN      NS      k.root-servers.net.
.                       31133   IN      NS      i.root-servers.net.
.                       31133   IN      NS      j.root-servers.net.
.                       31133   IN      NS      e.root-servers.net.
.                       31133   IN      NS      h.root-servers.net.
.                       31133   IN      NS      g.root-servers.net.
.                       31133   IN      NS      a.root-servers.net.
.                       31133   IN      NS      f.root-servers.net.
.                       31133   IN      NS      c.root-servers.net.
.                       31133   IN      NS      b.root-servers.net.
.                       31133   IN      NS      m.root-servers.net.

;; Query time: 8 msec
;; SERVER: 127.0.0.11#53(127.0.0.11) (UDP)
;; WHEN: Tue Dec 12 03:20:07 UTC 2023
;; MSG SIZE  rcvd: 239

ccd72fe225c8:/# netstat -ntlup
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.11:39527        0.0.0.0:*               LISTEN      -
udp        0      0 127.0.0.11:35031        0.0.0.0:*                           -
ccd72fe225c8:/# iptables-nft-save
# Generated by iptables-nft-save v1.8.9 (nf_tables) on Tue Dec 12 03:20:27 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER_OUTPUT - [0:0]
:DOCKER_POSTROUTING - [0:0]
-A OUTPUT -d 127.0.0.11/32 -j DOCKER_OUTPUT
-A POSTROUTING -d 127.0.0.11/32 -j DOCKER_POSTROUTING
# Queries for DNS:
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:39527
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:35031
# Response from DNS:
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p tcp -m tcp --sport 39527 -j SNAT --to-source :53
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p udp -m udp --sport 35031 -j SNAT --to-source :53
COMMIT
# Completed on Tue Dec 12 03:20:27 2023
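
The rules above are how queries sent to `127.0.0.11:53` reach the embedded DNS server, which listens on random high ports (compare the `netstat` output: 39527/tcp and 35031/udp). Below is a small sketch that extracts those redirect targets from `iptables-save`-style text on stdin. `dns_redirects` is a hypothetical helper, and the parsing assumes the rule layout shown above.

```shell
#!/bin/sh
# dns_redirects
# Read iptables-save output on stdin and print "<proto> <target>" for every
# DNAT rule in the DOCKER_OUTPUT chain that matches destination port 53.
dns_redirects() {
    awk '
        $1 == "-A" && $2 == "DOCKER_OUTPUT" {
            proto = ""; dport = ""; target = ""
            for (i = 3; i < NF; i++) {
                if ($i == "-p")               proto  = $(i + 1)
                if ($i == "--dport")          dport  = $(i + 1)
                if ($i == "--to-destination") target = $(i + 1)
            }
            if (dport == "53" && target != "") print proto, target
        }
    '
}

# Sample input copied from the session above.
dns_redirects <<'EOF'
-A OUTPUT -d 127.0.0.11/32 -j DOCKER_OUTPUT
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:39527
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:35031
EOF
```

Inside a container started with `NET_ADMIN`, as above, the same function could be fed live data with `iptables-nft-save | dns_redirects`.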

Note: the ports shown in the figure may differ from the actual ones.

	case container.HostConfig.NetworkMode.IsUserDefined():
		// The container uses a user-defined network. We use the embedded DNS
		// server for container name resolution and to act as a DNS forwarder
		// for external DNS resolution.
		// We parse the DNS server(s) that are defined in /etc/resolv.conf on
		// the host, which may be a local DNS server (for example, if DNSMasq or
		// systemd-resolvd are in use). The embedded DNS server forwards DNS
		// resolution to the DNS server configured on the host, which in itself
		// may act as a forwarder for external DNS servers.
		// If systemd-resolvd is used, the "upstream" DNS servers can be found in
		// /run/systemd/resolve/resolv.conf. We do not query those DNS servers
		// directly, as they can be dynamically reconfigured.
		*sboxOptions = append(
			*sboxOptions,
			libnetwork.OptionOriginResolvConfPath("/etc/resolv.conf"),
		)
	default:
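
Per the comment above, the embedded DNS server forwards external lookups to whatever nameservers the host's `/etc/resolv.conf` lists. Below is a minimal sketch for listing those candidates. `list_nameservers` is a hypothetical helper; it only reads the file and does not ask Docker what it actually uses.

```shell
#!/bin/sh
# list_nameservers [FILE]
# Print the address of each "nameserver" line in a resolv.conf-style file.
# FILE defaults to /etc/resolv.conf, the path passed to
# OptionOriginResolvConfPath in the snippet above.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

Run on the host as `list_nameservers`, it shows where external queries are expected to be forwarded. Run inside a container on a user-defined network, it should show only `127.0.0.11`, as in the `cat /etc/resolv.conf` output earlier.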

Some handy commands

# Share only db's PID namespace; this container still gets its own network on the default bridge
$ docker run --name busybox \
                 --pid container:db \
                 --rm -it busybox /bin/sh
/ # ps aux
PID   USER     TIME  COMMAND
    1 999       0:25 mysqld
  237 root      0:00 /bin/sh
  243 root      0:00 ps aux
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
48: eth0@if49: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever

# Share both the PID and the network namespace of db
$ docker run --name busybox \
                 --pid container:db \
                 --network container:db --rm -it busybox /bin/sh
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
38: eth0@if39: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
    link/ether 02:42:ac:12:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.2/16 brd 172.18.255.255 scope global eth0
       valid_lft forever preferred_lft forever
/ # ps aux
PID   USER     TIME  COMMAND
    1 999       0:26 mysqld
  245 root      0:00 /bin/sh
  252 root      0:00 ps aux
/ #
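
Whether two processes really share a namespace can be checked from the host by comparing the symlinks under `/proc/<pid>/ns/`. Below is a minimal sketch of that check. `same_ns` and its optional directory argument are hypothetical, and reading the namespaces of another user's process typically requires root.

```shell
#!/bin/sh
# same_ns TYPE PID1 PID2 [PROC_DIR]
# Succeed (exit 0) when both processes are in the same namespace of the given
# TYPE (net, pid, mnt, ...). The kernel exposes each namespace as a symlink
# such as "net:[4026531840]"; equal link targets mean a shared namespace.
# PROC_DIR defaults to /proc (hypothetical override, useful for dry runs).
same_ns() {
    proc="${4:-/proc}"
    a="$(readlink "$proc/$2/ns/$1")" || return 2
    b="$(readlink "$proc/$3/ns/$1")" || return 2
    [ "$a" = "$b" ]
}
```

With the PIDs taken from `docker inspect -f '{{.State.Pid}}' <name>`, one would expect `same_ns pid` to succeed for both runs above, and `same_ns net` to succeed only for the second run, where `--network container:db` is set.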