This repository has been archived by the owner on Dec 23, 2020. It is now read-only.

Can't connect (self-assigns IP after 5') #17

Open
NdK73 opened this issue Jun 2, 2020 · 33 comments

Comments

@NdK73

NdK73 commented Jun 2, 2020

Hello.

Following directions in other issues, I removed core 2.7.1 and installed your 0.0.3 .
The code I'm compiling is:

#include <ESP8266WiFi.h>

#include <W5500lwIP.h>

// Using GPIO2 as CS
Wiznet5500lwIP eth(2);

void setup ()
{
  Serial.begin(115200);

  Serial.println("");
//  WiFi.mode(WIFI_OFF);

  SPI.begin();
//  SPI.setBitOrder(MSBFIRST);
//  SPI.setDataMode(SPI_MODE0);
  // Slow down because long (~20cm) wires and breadboard
  SPI.setFrequency(10000000); // Works up to 80M

  eth.setDefault(); // use ethernet for default route
  bool r=eth.begin(); // default mtu & mac address
  if(!r) {
    Serial.println("Error initializing eth");
    while(true) delay(1000);
  }
}

void loop ()
{
  delay(5000);
  if(eth.connected()) {
    Serial.print("My IP address: ");
    Serial.println(eth.localIP());
  } else {
    Serial.println("Still unconnected");
  }

}

It prints "Still unconnected" for about 5 minutes, then self-assigns 169.254.205.199.
The W5500 board is connected correctly (I think: if I swap MISO/MOSI, or disconnect SCLK or CS, I get "Error initializing eth").
I left SPI settings commented after trying 'em and noticing they didn't change anything.
I don't see DHCP requests on my LAN.
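The 169.254.x.x address reported above lies in the IPv4 link-local range (169.254.0.0/16, RFC 3927), which an interface typically assigns itself when no DHCP answer arrives, so `eth.connected()` turning true does not by itself prove that a lease was obtained. A minimal sketch of telling the two cases apart, written in plain C++ so it can be tried off-target; `isLinkLocal` is a hypothetical helper, not part of the W5500lwIP or ESP8266 core API.

```cpp
#include <cassert>
#include <cstdint>

// True when the address a.b.c.d lies in 169.254.0.0/16 (IPv4 link-local, RFC 3927).
// Only the first two octets decide; the last two are accepted for convenience.
constexpr bool isLinkLocal(uint8_t a, uint8_t b, uint8_t /*c*/, uint8_t /*d*/) {
    return a == 169 && b == 254;
}

// Usage example with addresses quoted in this thread.
inline void linkLocalExample() {
    assert(isLinkLocal(169, 254, 205, 199));   // the self-assigned address from the report
    assert(!isLinkLocal(192, 168, 178, 38));   // an address handed out by DHCP
}
```

In a sketch, the octets could presumably be taken from the `IPAddress` returned by `eth.localIP()` via its `operator[]`; that wiring is an assumption and is not shown in the thread.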

@d-a-v
Owner

d-a-v commented Jun 2, 2020

I just tried your sketch and obtained an IP address:

Still unconnected
My IP address: 10.0.1.118
My IP address: 10.0.1.118
My IP address: 10.0.1.118

I also tried with the UDP sketch from the other issue and it works:

SDK:2.2.2-dev(38a443e)/Core:2.7.1-99-g59fe44e7=20701099/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-33-g3d57f60/BearSSL:5c771be

............My IP address: 10.0.1.118
:urn 7
Received packet of size 7 from 10.0.1.8:53569
    (to 10.0.1.118:8888, free heap = 40360 B)
:urd 7, 7, 0
Contents:
cdscsd

with this sketch which is a mix of the Udp demo and your sketch:

#include <ESP8266WiFi.h>
#include <WiFiUdp.h>
#include <W5500lwIP.h>

// Using GPIO16 as CS
Wiznet5500lwIP eth(16);
WiFiUDP Udp;

unsigned int localPort = 8888;      // local port to listen on

// buffers for receiving and sending data
char packetBuffer[UDP_TX_PACKET_MAX_SIZE + 1]; //buffer to hold incoming packet,
char  ReplyBuffer[] = "acknowledged\r\n";       // a string to send back

void setup ()
{
  Serial.begin(115200);

  Serial.println("");
//  WiFi.mode(WIFI_OFF);

  SPI.begin();
//  SPI.setBitOrder(MSBFIRST);
//  SPI.setDataMode(SPI_MODE0);
  // Slow down because long (~20cm) wires and breadboard
  SPI.setFrequency(10000000); // Works up to 80M

  eth.setDefault(); // use ethernet for default route
  bool r=eth.begin(); // default mtu & mac address
  if(!r) {
    Serial.println("Error initializing eth");
    while(true) delay(1000);
  }

  // This loop is odd; do not treat it as a good example.
  // A better example will be provided.
  // Reason: `delay()` requires changes like those in
  // https://github.com/esp8266/Arduino/pull/6212
  // before the usual loop waiting for the DHCP response can be used.
  while (!eth.connected())
  {
    Serial.print(".");
    for (int i = 0; i < 500; i++)
        if (eth.connected())
            break;
        else
            delay(1);
  }

  Serial.print("My IP address: ");
  Serial.println(eth.localIP());

  Udp.begin(localPort);
}

void loop ()
{

  // if there's data available, read a packet
  int packetSize = Udp.parsePacket();
  if (packetSize) {
    Serial.printf("Received packet of size %d from %s:%d\n    (to %s:%d, free heap = %d B)\n",
                  packetSize,
                  Udp.remoteIP().toString().c_str(), Udp.remotePort(),
                  Udp.destinationIP().toString().c_str(), Udp.localPort(),
                  ESP.getFreeHeap());

    // read the packet into packetBuffer
    int n = Udp.read(packetBuffer, UDP_TX_PACKET_MAX_SIZE);
    packetBuffer[n] = 0;
    Serial.println("Contents:");
    Serial.println(packetBuffer);

    // send a reply, to the IP address and port that sent us the packet we received
    Udp.beginPacket(Udp.remoteIP(), Udp.remotePort());
    Udp.write(ReplyBuffer);
    Udp.endPacket();
  }

}

/*
  test (shell/netcat):
  --------------------
	  nc -u 192.168.esp.address 8888
*/

@NdK73
Can you run tcpdump on your DHCP server?
Or Wireshark on your PC; you should see the DHCP request on Ethernet.
You can also try to use netdump: Netdump (you need to import the lib from there).
(A better Netdump will be included in the Arduino core, but it is not there yet.)
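The DHCP wait loop in the sketch above spins forever if no address ever arrives. A hedged sketch of a bounded wait follows; it is generic C++ so it can be exercised with a fake clock, and `waitUntil` plus its three callables are hypothetical names, not part of the core or of W5500lwIP.

```cpp
#include <cassert>
#include <cstdint>

// Polls `ready()` until it returns true or `timeoutMs` has elapsed.
// `nowMs()` returns a millisecond counter; `pause()` yields for a short while.
template <typename Ready, typename Now, typename Pause>
bool waitUntil(Ready ready, Now nowMs, Pause pause, uint32_t timeoutMs) {
    const uint32_t start = nowMs();
    while (!ready()) {
        // Unsigned subtraction stays correct if the counter wraps around.
        if (static_cast<uint32_t>(nowMs() - start) >= timeoutMs) {
            return false;   // gave up; the caller can report "no DHCP answer"
        }
        pause();
    }
    return true;
}

// Usage example driven by a simulated clock instead of real hardware.
inline void waitUntilExample() {
    uint32_t fakeClock = 0;
    const bool up = waitUntil([&] { return fakeClock >= 30; },   // "link is up" after 30 ms
                              [&] { return fakeClock; },
                              [&] { fakeClock += 10; },
                              1000);
    assert(up);
}
```

On the ESP8266 one would presumably pass `eth.connected()`, `millis()` and `delay(1)` wrapped in lambdas. Whether `delay()` gives the Ethernet driver a chance to run depends on the core version, as the comment in the sketch points out.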

@NdK73
Author

NdK73 commented Jun 2, 2020

Your sketch seems to work (partially) after changing CS from GPIO2 to GPIO16. Tks.
Now I get:

22:40:25.642 -> ....read error?
22:40:27.300 -> 
22:40:27.699 -> My IP address: 192.168.178.38
22:40:28.829 -> read error?
22:40:28.829 -> 

Then a lot of "read error?" messages. But at least this time I saw the DHCP request (tcpdump from my PC) and the IP gets assigned.

UDP rx seems to work:

22:44:28.319 -> Received packet of size 13 from 192.168.178.2:41859
22:44:28.319 ->     (to 192.168.178.38:8888, free heap = 42384 B)
22:44:28.319 -> Contents:
22:44:28.319 -> test message
22:44:28.319 -> 

I'll have to test multicast too (my domotic protocol uses it) as soon as I understand what was wrong in my sketch and fix the "read error?" messages.

PS: thanks for the fast answer and all the work you're doing!

@NdK73
Author

NdK73 commented Jun 2, 2020

It stops working (does not get address, but W5500 is recognized by begin() ) if I change CS pin. Tried with GPIO2 and GPIO0 with no luck.
Tested SPI from 1MHz up to 40MHz and the "read error?" messages are still there :(

@d-a-v
Owner

d-a-v commented Jun 2, 2020

About read error message,
can you edit https://github.com/d-a-v/Arduino/blob/ethernet/cores/esp8266/lwIPIntfDev.h#L306 and replace it with

Serial.println("read error len=%d tot_len=%d?\r\n",(int)len,(int)tot_len);

(look for lwIPIntfDev.h on your computer)

@NdK73
Author

NdK73 commented Jun 2, 2020

Done (printf instead of println). It now prints:

23:27:56.647 -> ....read error len=0 tot_len=60?
23:27:58.410 -> My IP address: 192.168.178.38
23:27:59.907 -> read error len=0 tot_len=60?

@d-a-v
Owner

d-a-v commented Jun 2, 2020

I have three eth modules (w5100, w5500, enc28j60). Each one heats up a lot while working.
Can you double-check your power supply?
Try another one?
I will check the code to try to understand why it happens (under what condition, or whether the cause can be printed).

@d-a-v
Owner

d-a-v commented Jun 2, 2020

You can try replacing this line with

if (1) // if ((buffer[0] & 0x01) || memcmp(&buffer[0], _mac_address, 6) == 0)

https://github.com/d-a-v/Arduino/blob/fef264e98589d8ac62e859bfcc978e919e7e6c95/libraries/lwIP_w5500/src/utility/w5500.cpp#L389

lwIP will do the filtering anyway. If this removes the issue for you, then I'll check the other drivers.
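For readers following along, the condition being bypassed above is a software destination filter: a frame is accepted when the low bit of its first byte is set (a group address, i.e. multicast or broadcast) or when its first six bytes equal the interface's own MAC address. A minimal stand-alone sketch of that test; `acceptFrame` and the MAC values are illustrative, not taken from the driver.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Mirrors the quoted condition: accept group-addressed frames (I/G bit set)
// and frames whose destination is exactly our own MAC address.
inline bool acceptFrame(const uint8_t* frame, const uint8_t ownMac[6]) {
    const bool groupAddress = (frame[0] & 0x01) != 0;
    return groupAddress || std::memcmp(frame, ownMac, 6) == 0;
}

// Usage example with made-up addresses.
inline void acceptFrameExample() {
    const uint8_t own[6]       = {0x02, 0x11, 0x22, 0x33, 0x44, 0x55};
    const uint8_t broadcast[6] = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
    const uint8_t other[6]     = {0x02, 0x11, 0x22, 0x33, 0x44, 0x56};
    assert(acceptFrame(own, own));         // unicast to us
    assert(acceptFrame(broadcast, own));   // broadcast
    assert(!acceptFrame(other, own));      // unicast to someone else
}
```

Replacing the condition with `if (1)` hands every received frame to lwIP, which then applies its own filtering.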

@NdK73
Author

NdK73 commented Jun 2, 2020

My module is not heating abnormally. I only connected 5V line, not the 3v3 one. And it's from the same USB that's powering the D1mini.

With line 389 replaced as given the messages are gone.
Tks again. I hope it gets included soon in mainline core (possibly 2.7.2?).

@d-a-v
Owner

d-a-v commented Jun 2, 2020

No hope for 2.7.2. It will be 3.0.0, or git head after 2.7.2 is out.
I'll keep the alpha or gamma version updated though (with a better display of what's in them).
No more errors displayed, and everything else is fine, meaning no transmission errors?
That's great. Thanks for testing and reporting!

d-a-v added a commit to d-a-v/Arduino that referenced this issue Jun 2, 2020
@NdK73
Author

NdK73 commented Jun 3, 2020

Then I'll keep gamma installed :)
I didn't do much more testing, but there are no more rx errors. Still need to test multicast.
A strange thing I just noticed: this morning it was no longer pingable. A quick reset "fixed" it, but it shouldn't be needed. Possibly a problem with DHCP renew? I'll keep pinging it from time to time during the day just to check.

@NdK73
Author

NdK73 commented Jun 3, 2020

Problem confirmed: after 3h it's not answering pings any more. Seems unrelated to DHCP renew, since it's set to 10 days.
No messages printed on serial console.

Tks again.

@d-a-v
Owner

d-a-v commented Jun 3, 2020

I would need to be able to reproduce. What is the running sketch ?

@NdK73
Author

NdK73 commented Jun 3, 2020

The one you posted above. Only change: the SPI speed (boosted to 40MHz... Going to try down to 10).

@NdK73
Author

NdK73 commented Jun 3, 2020

Already offline at 10MHz :( 2h or less.

@d-a-v
Owner

d-a-v commented Jun 3, 2020

Ah. I have another one - a multiinterface mDNS test - up and running for several hours.

Can you compile with every debug option enabled, and add a loop that prints hello every sec or so,

// in loop():
static unsigned long lastTime = 0;
if (millis() - lastTime > 1000) { lastTime = millis(); Serial.println("hearbeat"); }
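The `millis() - lastTime > 1000` idiom in the snippet above relies on unsigned subtraction, so it keeps working when the 32-bit millisecond counter wraps (after roughly 49.7 days). A small off-target sketch of the same logic; the `Heartbeat` struct is a hypothetical wrapper, and the timestamps are fed in by hand instead of coming from `millis()`.

```cpp
#include <cassert>
#include <cstdint>

// Reports "due" once more than `periodMs` has passed since the last tick,
// then re-arms. Unsigned subtraction keeps this correct across wrap-around.
struct Heartbeat {
    uint32_t periodMs;
    uint32_t last;

    explicit Heartbeat(uint32_t period, uint32_t start = 0)
        : periodMs(period), last(start) {}

    bool due(uint32_t nowMs) {
        if (static_cast<uint32_t>(nowMs - last) > periodMs) {
            last = nowMs;
            return true;
        }
        return false;
    }
};

// Usage example that crosses the 32-bit wrap-around point.
inline void heartbeatExample() {
    Heartbeat hb(1000, 0xFFFFFF00u);   // armed shortly before the counter wraps
    assert(!hb.due(0xFFFFFFF0u));      // 240 ms elapsed: not yet
    assert(hb.due(0x000003F0u));       // counter wrapped, 1264 ms elapsed: due
    assert(!hb.due(0x00000400u));      // 16 ms after re-arming: not yet
}
```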

@NdK73
Author

NdK73 commented Jun 3, 2020

I just completed a test: I left the sketch running, pinging it since the reset.

64 bytes from 192.168.178.38: icmp_seq=1049 ttl=255 time=1.31 ms
64 bytes from 192.168.178.38: icmp_seq=1050 ttl=255 time=1.24 ms
From 192.168.178.2 icmp_seq=1078 Destination Host Unreachable
From 192.168.178.2 icmp_seq=1079 Destination Host Unreachable
From 192.168.178.2 icmp_seq=1080 Destination Host Unreachable

Re: your test, it just started (w/ a slight mod: I'm printing lastTime value before "hearbeat"):

16:10:37.554 -> SDK:2.2.2-dev(38a443e)/Core:2.7.1-205-gcf4673c5=20701205/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-33-g3d57f60/BearSSL:5c771be
16:10:37.582 -> 
16:10:37.582 -> ....My IP address: 192.168.178.38
16:10:39.543 -> 2049 hearbeat
16:10:40.541 -> 3050 hearbeat
16:10:41.538 -> 4051 hearbeat
16:10:42.569 -> 5052 hearbeat
16:10:43.567 -> 6053 hearbeat
16:10:44.565 -> 7054 hearbeat
16:10:45.562 -> 8055 hearbeat
16:10:46.558 -> 9056 hearbeat
16:10:47.555 -> 10057 hearbeat
16:10:48.020 -> wifi evt: 7
16:10:48.552 -> 11058 hearbeat

That "wifi evt: 7" makes me think... Could it be that it's connecting BOTH via eth and WiFi? Once it ends I'll try re-enabling WiFi.mode(WIFI_OFF)...
BTW couldn't it be better to slightly modify the MAC address instead of copying it from WiFi? IIUC, when the least significant bits of the first octet are '10', it means "locally administered (1)" and "unicast (0)".
mac[0]=(mac[0]&0xfe)|0x02;
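To make the bit manipulation in that one-liner concrete: in the first octet of a MAC address, bit 0 is the individual/group bit (0 = unicast) and bit 1 is the universal/local bit (1 = locally administered). A stand-alone sketch; `makeLocalUnicast` is a hypothetical helper name and the address is only an illustrative value.

```cpp
#include <cassert>
#include <cstdint>

// Clears bit 0 (unicast) and sets bit 1 (locally administered) of the
// first octet, leaving the other five octets untouched.
inline void makeLocalUnicast(uint8_t mac[6]) {
    mac[0] = static_cast<uint8_t>((mac[0] & 0xFE) | 0x02);
}

// Usage example.
inline void macExample() {
    uint8_t mac[6] = {0x5C, 0xCF, 0x7F, 0x01, 0x02, 0x03};   // illustrative value
    makeLocalUnicast(mac);
    assert(mac[0] == 0x5E);         // 0101 1100 -> 0101 1110
    assert((mac[0] & 0x01) == 0);   // unicast
    assert((mac[0] & 0x02) != 0);   // locally administered
    assert(mac[1] == 0xCF);         // rest of the address unchanged
}
```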

@d-a-v
Owner

d-a-v commented Jun 3, 2020

If WiFi is not turned off, dhcp might stop at some point and put interfaces down. It shouldn't put eth down anyway. I should try and reproduce with that.

Yes trying to disable WiFi like you propose is interesting.

Mac address is copied then slightly modified for the ethernet case.

edit: I'm open to any proposal for the MAC update, given that we can have more than one external interface.

@NdK73
Author

NdK73 commented Jun 3, 2020

I'd combine your method with the masking I suggested, so it's clear it's not an official (IEEE-assigned) address. Anyway ESP8266 doesn't have enough free pins to handle many interfaces, unless it's somehow possible to use the flash SPI bus (that would free 3 more pins) like display "overlay" mode.

The currently running test just passed 1024th ping with no issues (up to 1324). So it's quite probably wifi-related. Or a race condition with wifi code.

@NdK73
Author

NdK73 commented Jun 3, 2020

Argh. Nope. Died at ping 2096 :(

@d-a-v
Owner

d-a-v commented Jun 3, 2020

> Anyway ESP8266 doesn't have enough free pins to handle many interfaces

miso, mosi and clock are shared, then an additional CS per spi device is needed.

> I'd combine your method with the masking I suggested, so it's clear it's not an official (IEEE-assigned) address.

Can you think of a valid mac address based on the STA one and a non zero number (between 1 and 8) ?

> Argh. Nope. Died at ping 2096 :(

Anything relevant on console ?

@NdK73
Author

NdK73 commented Jun 3, 2020

> miso, mosi and clock are shared, then an additional CS per spi device is needed.

Yup. But the other pins are already overloaded: RX/TX for serial, 4/5 for I2C, 0/2/15 should be usable with some tricks, 16 is OK if you don't need deep sleep. An alternative could be to use an I2C expander, but it gets really slow :)
If you don't need any other IO you can probably connect up to 8 eth interfaces (or other SPI devices). By using overlap mode (SPI.pins(6, 7, 8, 0) ) you're limited to a single SPI device (must use GPIO0 as CS, unless you add external circuitry to 'or' it w/ another GPIO) but the previously used MISO/MOSI/SCLK are freed (you're using the flash SPI interface). Using the "or" trick that would mean a total of 10 interfaces...

> Can you think of a valid mac address based on the STA one and a non zero number (between 1 and 8) ?

Just add

_macAddress[0]=(_macAddress[0]&0xFE)|0x02;

after

_macAddress[3] += _netif.num;

> Anything relevant on console ?

Nothing at all on console: hb messages continue but the network is completely unresponsive (does not receive UDP packets).

d-a-v added a commit to d-a-v/Arduino that referenced this issue Jun 3, 2020
@d-a-v
Owner

d-a-v commented Jun 3, 2020

> Just add

Thanks, added

I restarted my run to watch heap:

  static periodicFastMs showNow(10000);
  if (showNow)
  {
    Serial.printf("heap= %zd B - up= %zd mn\n", ESP.getMaxFreeBlockSize(), millis()/60000);
  }

I currently have

heap= 46352 B - up= 24 mn

@NdK73
Author

NdK73 commented Jun 3, 2020

I don't have periodicFastMs, but changed the hb loop to use that printf every 5s.
Seems there's no leak (after receiving first UDP packet).

@d-a-v
Owner

d-a-v commented Jun 3, 2020

Sorry this is missing:

#include <PolledTimeout.h>
using namespace esp8266::polledTimeout;

...

heap = 46352 B - up= 73 mn
heap = 46352 B - up= 74 mn

edit later:

heap = 45432 B - up= 656 mn
heap = 45432 B - up= 656 mn
heap = 45432 B - up= 657 mn
heap = 45432 B - up= 657 mn

played with mDNS and ethernet or AP ("heap" is maxBlock)

heap = 45432 B - up= 658 mn
heap = 45432 B - up= 658 mn
heap = 45432 B - up= 659 mn
heap = 45008 B - up= 659 mn
heap = 45008 B - up= 659 mn
heap = 45008 B - up= 659 mn

@NdK73
Author

NdK73 commented Jun 4, 2020

Now it ran overnight w/o problems. ARGH! Just changing a printf lets it run???

But, returning to original issue, if I just change CS line to GPIO2 or GPIO0, it stops working. How can it be?

@d-a-v
Owner

d-a-v commented Jun 4, 2020

GPIO0 and GPIO2 must have pullups for the ESP to boot. Do you think they can interfere with a CS output pin?

@NdK73
Author

NdK73 commented Jun 4, 2020

No. I'm using a D1mini that already has the needed pullup. And there's another pullup on the W5500 board (that's the reason a buffer transistor is needed when using GPIO15 as CS: it must be low at boot). Having two pullups in parallel just makes it a stronger pullup, but the line can still be pulled low w/o problems (I often have 2 or even 3 10k pullups on I2C busses, especially for longer ones).
Moreover, the ESP boots and the sketch starts. It "just" can't get IP, but it seems to recognize W5500.
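The parallel-pullup argument can be checked with a little arithmetic. The sketch below assumes two 10k pullups to 3.3V; the actual resistor values on the D1 mini and on the W5500 board are not stated in this thread, so treat the numbers as placeholders. `parallel` and `sinkCurrentmA` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>

// Equivalent resistance of two resistors in parallel, in ohms.
constexpr double parallel(double r1, double r2) {
    return (r1 * r2) / (r1 + r2);
}

// Current, in mA, that a pin must sink to hold a line near 0 V against a
// pull-up of `rOhms` tied to `vcc` volts.
constexpr double sinkCurrentmA(double vcc, double rOhms) {
    return vcc / rOhms * 1000.0;
}

// Usage example with the assumed values.
inline void pullupExample() {
    const double r = parallel(10000.0, 10000.0);      // two 10k pullups
    assert(std::fabs(r - 5000.0) < 1e-9);             // behave like one 5k pullup
    const double i = sinkCurrentmA(3.3, r);
    assert(std::fabs(i - 0.66) < 1e-9);               // 0.66 mA to pull the line low
}
```

Under these assumptions the CS pin only has to sink a fraction of a milliamp, which is consistent with the line still being pulled low without trouble.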

@d-a-v
Owner

d-a-v commented Jun 4, 2020

I can't say much about hardware issues. I am also using a d1 mini. I shall try to use these gpio as CS.
With a stronger pullup, maybe the CS voltage is not low enough, or is around the limit for this particular chip (more consumption once it's enabled and dealing with packets, a weak 3.3V regulator on the D1 also powering the W5500 (mine is heating, like I said), ...).
Can you check the voltage with an oscilloscope?

@NdK73
Author

NdK73 commented Jun 4, 2020

Do not overload regulators! Both boards have their own. Just connect the 5V on the W5500, not the 3v3. This way it's powered directly from the USB and D1's regulator won't overheat.

@d-a-v
Owner

d-a-v commented Jun 4, 2020

That's not what I did but you're right (my W5500 5v is not connected, I use the d1mini's 3.3v for both).

Do you think it is worth testing CS due to parallel pullups ?

@NdK73
Author

NdK73 commented Jun 4, 2020

> That's not what I did but you're right (my W5500 5v is not connected, I use the d1mini's 3.3v for both).

That way you're probably overloading it, or at least you're very near the limits.

> Do you think it is worth testing CS due to parallel pullups ?

I just tested it, and pulses on GPIO0 are quite good: from 3.1V down to 0.25V with a bit of overshoot towards 0V. Clean edges. I'd expect no issues HW-wise. And the W5500 actually gets recognized (no error at begin() ).

@NdK73
Author

NdK73 commented Jun 6, 2020

Another data point: I've let it run (w/ self-assigned IP) for quite some time (2737 minutes), then reset it and it got an address! CS currently on GPIO0.
Just pressed and released reset button. No recompile, reflash, no nothing. ARGH!

@NdK73
Author

NdK73 commented Jun 16, 2020

I added a "pinger" that, after 5 failed pings, tries to reset the w5500.
And I get this stack trace from a WDT reset:

0x40215f69: netif_add_LWIP2 at /home/gauchard/dev/esp8266/esp8266/tools/sdk/lwip2/builder/lwip2-src/src/core/netif.c line 375
0x40201020: LwipIntfDev ::netif_init_s(netif*) at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/cores/esp8266/lwIPIntfDev.h line 219
0x40202c61: Wiznet5500::wizchip_read(unsigned char, unsigned short) at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/libraries/lwIP_w5500/src/utility/w5500.cpp line 55
0x402012d1: LwipIntfDev ::begin(unsigned char const*, unsigned short) at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/cores/esp8266/lwIPIntfDev.h line 149
0x40212590: ethernet_input_LWIP2 at /home/gauchard/dev/esp8266/esp8266/tools/sdk/lwip2/builder/lwip2-src/src/netif/ethernet.c line 82
0x40100331: digitalWrite at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/cores/esp8266/core_esp8266_wiring_digital.cpp line 86
0x40100331: digitalWrite at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/cores/esp8266/core_esp8266_wiring_digital.cpp line 86
0x40202c61: Wiznet5500::wizchip_read(unsigned char, unsigned short) at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/libraries/lwIP_w5500/src/utility/w5500.cpp line 55
0x402030ac: Wiznet5500::end() at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/libraries/lwIP_w5500/src/utility/w5500.cpp line 332 (discriminator 1)
0x402013af: resetW5500() at /home/ndk/Arduino/test_eth_disp/test_eth_disp.ino line 30
0x40206928: Print::println(char const*) at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/cores/esp8266/Print.cpp line 198
0x402016ee: loop at /home/ndk/Arduino/test_eth_disp/test_eth_disp.ino line 141
0x402014d0: std::_Function_handler  ::begin(unsigned char const*, unsigned short)::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /home/ndk/.arduino15/packages/esp8266/tools/xtensa-lx106-elf-gcc/2.5.0-4-b40a506/xtensa-lx106-elf/include/c++/4.8.2/functional line 2058
0x4020a618: DhcpServer::add_offer_options(unsigned char*) at ?? line ?
0x401000d5: std::function ::operator()() const at /home/ndk/.arduino15/packages/esp8266/tools/xtensa-lx106-elf-gcc/2.5.0-4-b40a506/xtensa-lx106-elf/include/c++/4.8.2/functional line 2465
0x40206cb0: run_scheduled_recurrent_functions() at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/cores/esp8266/Schedule.cpp line 223
0x402010e8: std::_Function_base::_Base_manager ::_M_manager(std::_Any_data&, std::_Function_base::_Base_manager  const&, std::_Manager_operation) at /home/ndk/.arduino15/packages/esp8266/tools/xtensa-lx106-elf-gcc/2.5.0-4-b40a506/xtensa-lx106-elf/include/c++/4.8.2/functional line 1931
0x401001c0: ets_post at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/cores/esp8266/core_esp8266_main.cpp line 177
0x40207980: loop_wrapper() at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/cores/esp8266/core_esp8266_main.cpp line 197
0x401013fd: cont_wrapper at /home/ndk/.arduino15/packages/esp8266/hardware/esp8266/0.0.3/cores/esp8266/cont.S line 81

The resetter is like the setup() code, plus eth.end():

void resetW5500()
{
  eth.end();
  if(eth.begin()) {
    Serial.println("DHCP");
    while (!eth.connected())
    {
      Serial.print(".");
      for (int i = 0; i < 500; i++)
          if (eth.connected())
              break;
          else
              delay(1);
    }
  
    char addr[]="Got                ";
    Serial.print("My IP address: ");
    Serial.println(eth.localIP());
  } else {
    Serial.println("Eth reset failed");
  }
}

IIUC the WDT triggers inside eth.end().

d-a-v added a commit to esp8266/Arduino that referenced this issue Dec 22, 2020
This commit adds W5500 W5100 and ENC28j60 drivers from @njh with credits
They are available in libraries/
An example is added in W5500 examples directory

plus:
* Extract dhcp server from lwip2 and add it to the core as a class.
  It must always be present, it is linked and can be called by fw on boot.
  So it cannot be stored in a library.
* ethernet: static or dhcp works
* PPPServer: example
* bring WiFi.config() to the lwIP generic interface (argument reorder common function)
* move hostname() from WiFI-STA to generic interface
* remove non readable characters from dhcp-server comments
* dhcp-server: magic_cookie is part of bootp rfc
* fixes from d-a-v/W5500lwIP#17
* enable lwip_hook_dhcp_parse_option()
* +ethernet tcp client example in w5500 library examples