Cannot make a curl to pod's IP probably due to interface mismatch #10233

Open · TuanTranBPK opened this issue Apr 15, 2025 · 8 comments

@TuanTranBPK commented Apr 15, 2025
I installed a K8s cluster using kubeadm:

sudo kubeadm init --pod-network-cidr=10.42.0.0/16

kubectl taint nodes --all node-role.kubernetes.io/control-plane-

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.2/manifests/tigera-operator.yaml

kubectl apply -f calico-config.yaml

calico-config.txt
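For reference, a minimal Installation resource matching this setup would look roughly like the sketch below; the attached calico-config.txt is what was actually applied, so treat this only as an illustration of the pod CIDR and VXLAN pool settings.

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
      - cidr: 10.42.0.0/16        # must match --pod-network-cidr above
        encapsulation: VXLAN
        natOutgoing: Enabled
        nodeSelector: all()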

Then I installed an nginx pod.
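(The exact pod manifest isn't attached; it was created along these lines, and the pod IP referenced below is 10.42.53.71.)

kubectl create deployment nginx --image=nginx
kubectl get pods -o wide    # the nginx pod IP here is 10.42.53.71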

Expected Behavior

From the node, I should be able to curl the nginx pod's IP.

Current Behavior

The curl fails:
[trant@eam32 calico]$ curl -v 10.42.53.71

*   Trying 10.42.53.71:80...
* connect to 10.42.53.71 port 80 failed: No route to host
* Failed to connect to 10.42.53.71 port 80: No route to host
* Closing connection 0
curl: (7) Failed to connect to 10.42.53.71 port 80: No route to host

The confusing part: when I run tcpdump on the cali7072c88a915 interface, I see ARP requests coming from a different IP address/interface (172.30.2.1) than the IP address shown for the node (10.12.178.104).

tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on cali7072c88a915, link-type EN10MB (Ethernet), snapshot length 262144 bytes
22:55:37.324031 ARP, Request who-has 10.42.53.71 tell 172.30.2.1, length 28
22:55:38.368903 ARP, Request who-has 10.42.53.71 tell 172.30.2.1, length 28
22:55:39.392902 ARP, Request who-has 10.42.53.71 tell 172.30.2.1, length 28
22:55:40.417003 ARP, Request who-has 10.42.53.71 tell 172.30.2.1, length 28
22:55:41.440908 ARP, Request who-has 10.42.53.71 tell 172.30.2.1, length 28
22:55:42.464908 ARP, Request who-has 10.42.53.71 tell 172.30.2.1, length 28

kubectl get nodes -o wide
NAME    STATUS   ROLES           AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                      KERNEL-VERSION                 CONTAINER-RUNTIME
eam32   Ready    control-plane   28m   v1.30.5   10.12.178.104   <none>        Rocky Linux 9.5 (Blue Onyx)   5.14.0-503.19.1.el9_5.x86_64   cri-o://1.22.5

Possible Solution

Don't know

Steps to Reproduce (for bugs)

See above.

Context

Your Environment

The network/interface configuration is in the attachment

ip_a.txt

Calico log:

calico-node.log

  • Calico version: v3.27.2
  • Calico dataplane (iptables, windows etc.):
  • Orchestrator version (e.g. kubernetes, mesos, rkt): Kubernetes v1.30.5
  • Operating System and version: Rocky Linux 9.5
  • Link to your project (optional):
@tomastigera (Contributor) commented

You have a whole bunch of interfaces on your node. Kubernetes may have picked one of them as the main interface and used its IP as the node's internal IP, while Calico may have auto-detected a different one. Check out https://docs.tigera.io/calico/latest/networking/ipam/ip-autodetection
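For example (a sketch only; which method and value is right depends on your interfaces), you can make Calico use the same address Kubernetes reports as the node's internal IP by setting the autodetection method in the Installation resource:

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    nodeAddressAutodetectionV4:
      kubernetes: NodeInternalIP    # or interface:/canReach:/cidrs:, see the docs above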

Could you provide ip route output?

@TuanTranBPK (Author) commented

I tried the IP autodetection modes (can-reach and interface) but it doesn't work. My ip route output is below; 10.42.53.71 is the nginx pod IP.
[trant@eam32 ~]$ ip r
default via 10.12.178.254 dev ens3f0np0.99 proto dhcp src 10.12.178.104 metric 401
10.12.178.0/23 dev ens3f0np0.99 proto kernel scope link src 10.12.178.104 metric 401
blackhole 10.42.53.64/26 proto 80
10.42.53.65 dev calica8461fef58 scope link
10.42.53.66 dev calib250ffe937d scope link
10.42.53.67 dev cali2f0666e3385 scope link
10.42.53.68 dev cali457d2a7ae14 scope link
10.42.53.69 dev calibcd5f644fa1 scope link
10.42.53.70 dev cali20b202672c4 scope link
10.42.53.71 dev cali7072c88a915 scope link
10.76.77.0/24 dev ens3f0np0.700 proto kernel scope link src 10.76.77.18 metric 402
172.16.0.0/12 proto static src 172.31.0.1
nexthop via 172.30.1.250 dev ens3f0np0.401 weight 1
nexthop via 172.30.2.250 dev enp196s0f1.402 weight 1
172.30.1.0/24 via 172.30.1.1 dev ens3f0np0.401 proto static metric 400
172.30.1.0/24 dev ens3f0np0.401 proto kernel scope link src 172.30.1.1 metric 403
172.30.2.0/24 dev enp196s0f1.402 proto kernel scope link src 172.30.2.1 metric 400
192.9.0.0/16 dev ens3f0np0.3 proto kernel scope link src 192.9.110.18 metric 404

@TuanTranBPK (Author) commented

Just adding more information to narrow down the problem. The same installation procedure as above works on a K8s cluster consisting of a single VM with several interfaces. The problem described in this issue occurs on a physical machine with multiple network cards.

ip-a-single-vm.txt

@TuanTranBPK (Author) commented

@tomastigera Is there anything else you need for your investigation? It seems to me that this is a bug.

Thanks,
Tuan

@tomastigera (Contributor) commented

From the logs above, your node picked: startup/autodetection_methods.go 103: Using autodetected IPv4 address on interface ens3f0np0.3: 192.9.110.18/16

The ip r above is from the node where the nginx server runs, which does not really say much about whether there is a route from the client. However, there is no route from this node to a similar CIDR. That is weird given that the IPPool is 10.42.0.0/16: Created default IPv4 pool (10.42.0.0/16) with NAT outgoing true. IPIP mode: Never, VXLAN mode: Always, DisableBGPExport: false

It also says VXLAN mode Always, but I do not see any vxlan device on this node. I do see a vxlan device in the device list of the single-VM setup.

However, I do see the device being created in the logs: 2025-04-15 20:34:45.475 [INFO][54] felix/vxlan_mgr.go 742: Assigning address to VXLAN device address=10.42.53.64/32 ipVersion=0x4

So I think the device is missing, and the routes to pods on other nodes via the vxlan device are missing too. That is the problem, but I cannot tell why they are missing 🤔
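A couple of quick checks on the node would confirm that (assuming the default device name vxlan.calico that Felix creates):

ip -d link show vxlan.calico     # should exist; -d also shows its parent interface
ip addr show vxlan.calico        # should carry 10.42.53.64/32 per the log line above
ip route show dev vxlan.calico   # routes to pod blocks on other nodes would appear here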

@tomastigera (Contributor) commented

Have you tried a newer version than v3.27.2?

@TuanTranBPK (Author) commented

I tried with the latest version, v3.30 (https://docs.tigera.io/calico/latest/getting-started/kubernetes/self-managed-onprem/onpremises), but the problem is still there.

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.30.0/manifests/operator-crds.yaml

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.30.0/manifests/tigera-operator.yaml

I modified only the CIDR field in the example file https://raw.githubusercontent.com/projectcalico/calico/v3.30.0/manifests/custom-resources.yaml and applied it:
kubectl apply -f custom-resources.yaml
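Concretely, the only change relative to the upstream file was the pool CIDR, roughly as in this sketch (everything else was left as shipped):

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
      - cidr: 10.42.0.0/16    # changed from the upstream default 192.168.0.0/16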

calico-node.log
tigera-operator.log

ip r
default via 10.12.178.254 dev ens3f0np0.99 proto dhcp src 10.12.178.104 metric 401
10.12.178.0/23 dev ens3f0np0.99 proto kernel scope link src 10.12.178.104 metric 401
blackhole 10.42.53.64/26 proto 80
10.42.53.65 dev cali6028fd2d0bb scope link
10.42.53.66 dev cali3d5af4bc40d scope link
10.42.53.67 dev cali1f4489ffc48 scope link
10.42.53.68 dev cali56aa98abf8e scope link
10.42.53.69 dev calie084588d789 scope link
10.42.53.70 dev cali98709610c8f scope link
10.42.53.71 dev cali62b602db861 scope link
10.42.53.72 dev cali02c87c8e7c0 scope link
10.42.53.73 dev cali034e5efd019 scope link
10.76.77.0/24 dev ens3f0np0.700 proto kernel scope link src 10.76.77.18 metric 402
172.16.0.0/12 proto static src 172.31.0.1
nexthop via 172.30.1.250 dev ens3f0np0.401 weight 1
nexthop via 172.30.2.250 dev enp196s0f1.402 weight 1
172.30.1.0/24 via 172.30.1.1 dev ens3f0np0.401 proto static metric 400
172.30.1.0/24 dev ens3f0np0.401 proto kernel scope link src 172.30.1.1 metric 403
172.30.2.0/24 dev enp196s0f1.402 proto kernel scope link src 172.30.2.1 metric 400
192.9.0.0/16 dev ens3f0np0.3 proto kernel scope link src 192.9.110.18 metric 404

Is there anything else you want to look at?

@coutinhop (Member) commented

@TuanTranBPK I see this line in your calico-node log:

2025-05-07 08:29:05.444 [INFO][9] startup/autodetection_methods.go 103: Using autodetected IPv4 address on interface ens3f0np0.3: 192.9.110.18/16

And it seems like the VXLAN device does get that interface ens3f0np0.3 for its parent:

2025-05-07 08:29:07.019 [INFO][83] felix/vxlan_mgr.go 822: Assigning address to VXLAN device address=10.42.53.64/32 ipVersion=0x4
2025-05-07 08:29:07.037 [INFO][83] felix/vxlan_mgr.go 627: VXLAN device parent changed from "" to "ens3f0np0.3" ipVersion=0x4

Do you know if that interface choice is "wrong"? (not the one connected to the rest of the cluster or something to that effect?)

You mentioned this:

I tried the IP autodetection modes (can-reach and interface) but it doesn't work.

But what were the results when trying those modes? Did you also try using CIDR(s) (https://docs.tigera.io/calico/latest/networking/ipam/ip-autodetection#change-the-autodetection-method)?
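For the CIDR method, the Installation spec would look something like this sketch, using the subnet that holds the node's internal IP (adjust if a different subnet is the right one for cluster traffic):

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    nodeAddressAutodetectionV4:
      cidrs:
        - "10.12.178.0/23"    # subnet of the node's INTERNAL-IP 10.12.178.104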
