What they mostly should know: TCP provides a bidirectional stream of bytes on th... (2024)


What they mostly should know: TCP provides a bidirectional stream of bytes on the application level. It does NOT provide a stream of packets.

That means whatever you pass to a send() call is not necessarily the same amount of data the receiver will observe in a single read() call. You might get more or fewer bytes, since the transport layer is free to buffer and to fragment data.
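
A minimal sketch of that mismatch, assuming Python and a local `socket.socketpair()` standing in for a real TCP connection (the function name and the 4-byte read size are only for illustration). Two sends go in, and the reader gets the bytes back in whatever chunk sizes its own `recv()` calls happen to produce:

```python
import socket


def demo_stream_has_no_boundaries():
    """Send two 'messages', then read the stream back in small chunks."""
    a, b = socket.socketpair()  # local stand-in for a connected stream socket
    try:
        a.sendall(b"hello ")
        a.sendall(b"world")
        a.shutdown(socket.SHUT_WR)  # signal end-of-stream to the reader

        chunks = []
        while True:
            chunk = b.recv(4)  # ask for at most 4 bytes per call
            if not chunk:      # empty result means the peer closed its side
                break
            chunks.append(chunk)
        return chunks
    finally:
        a.close()
        b.close()
```

The chunk boundaries depend on the reader and the stack, not on the two `sendall()` calls; only the byte order is preserved.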

I have seen the assumption that TCP has packet boundaries at the application level made too often, typically in Stack Overflow questions like: "I don't receive all data. Is my OS/library broken?"


> What they mostly should know: TCP provides a bidirectional stream of bytes on the application level. It does NOT provide a stream of packets.

> That means whatever you pass to a send() call is not necessarily the same amount of data the receiver will observe in a single read() call.

Yes, this. For god's sake, listen to them.

I had to fight a coworker on this. I had quickly created some client code just to validate that the server was working. Due to some quirk, all the messages were arriving in full in every read call. He told me to ship it.

I said no! "I need to check if there's more data and if so add a loop to read again" "But it is working, release it". That went on for a while, to no avail. Wouldn't look at documentation either.

Eventually he had to leave for the day, and I took the time to implement it correctly.

I started including basic TCP questions on interviews. Not many people even get past the TCP handshake (if they even know about that).


scott_s on May 15, 2020 | parent | next [–]


The problem here was not a lack of knowledge of a particular subject. The problem is that this person was unwilling to learn about a thing they thought they knew.


draw_down on May 15, 2020 | root | parent | next [–]


That's correct.


caseymarquis on May 16, 2020 | parent | prev | next [–]


I feel like I can Google all the syn/ack/packetsniffing bits when those come up during troubleshooting (Why is this disconnecting? What do you mean the gateway sends out rst packets when there's no activity for 5 minutes?!?). Seems a bit harsh to start with those. The guarantees about the protocol are the important part, the rest seems kind of superfluous unless there's an unusual problem or you're pushing the limits of a network.


dharmab on May 16, 2020 | root | parent | next [–]


I've personally been on many outage troubleshooting calls that cost hundreds of thousands of dollars precisely because engineers believed they could ignore the underlying details and corner cases of TCP. My favorite common mistake is that developers assume they can open a single connection with no retries and that connection will be reliable as long as the handshake succeeds.


SilasX on May 15, 2020 | parent | prev | next [–]


Stupid question: why would you be writing code that works at the level of TCP? Don't you usually want to use the OS's (or some popular library's) TCP software stack?


jfkebwjsbx on May 15, 2020 | root | parent | next [–]


It seems to me GP is talking about using TCP, not implementing it.


SilasX on May 15, 2020 | root | parent | next [–]


Right, I mean I thought that the TCP protocol implementation itself handled that issue, and your calls to such a library abstracted away from that.


jacobolus on May 16, 2020 | root | parent | next [–]


No, it does not, by design. When you use any standard TCP implementation the abstraction provided at both ends is just a stream of bytes. The guarantee TCP makes is that the bytes received at one end will be in the same order as the bytes that were sent at the other end.

If you want to use TCP to implement some (higher-level) message-based protocol, you need to parse those out for yourself.


SilasX on May 16, 2020 | root | parent | next [–]


The original question didn't refer to such issues of parsing, just checking for whether the full message was received, which, I would think, would be handled by the TCP library read() (or whatever) call. It sounded like the OP (outworlder) was delving into lower-level TCP details that should have been abstracted away -- my thinking is that the TCP caller shouldn't have to concern itself with details about checking whether the full message has been received, at least not as a separate step. That is, it just wouldn't return anything until the full message is received, or would include some data structure that indicates it's not complete. Does that make sense?

Edit: On second thought, I guess OP meant that all of the results were coming back "complete" which doesn't obviate the issue of needing to do a check that handles the "not done" case.


jacobolus on May 16, 2020 | root | parent | next [–]


> the TCP caller shouldn't have to concern itself with details about checking whether the full message has been received

The caller absolutely must concern itself with this crucial “detail”. If you do otherwise, your code is broken, full stop.

You can implement a higher-level protocol on top which handles this kind of thing internally and presents a higher-level interface (e.g. not passing any partial data along to its caller until a full message has been received), but if you are just working with TCP directly, what you get is just a stream of bytes. The guarantee you get is that the bytes will be in order and without any gaps.

If you e.g. send UTF-8 encoded text, you must be prepared on the read side to have your stream of bytes cut off arbitrarily in the middle of a character.
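
To make the UTF-8 case concrete, here is a small sketch assuming Python's standard `codecs` module; `decode_chunks` is a made-up helper name. An incremental decoder holds back the trailing bytes of an incomplete character until the next chunk arrives, whereas decoding each chunk on its own fails:

```python
import codecs


def decode_chunks(chunks):
    """Decode UTF-8 that may be split at arbitrary byte positions."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    pieces = [decoder.decode(chunk) for chunk in chunks]
    # final=True raises if the stream ended in the middle of a character.
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)


# "café" is b"caf\xc3\xa9"; pretend the stream was cut inside the "é".
text = decode_chunks([b"caf\xc3", b"\xa9"])
```

Calling `b"caf\xc3".decode("utf-8")` directly on the first chunk raises `UnicodeDecodeError`, which is the failure mode described above.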


SilasX on May 16, 2020 | root | parent | next [–]


Let function A call TCP read() and pass back a struct that includes the bytes it's received and a flag that indicates whether it's read to the end of the message.

Let function B call TCP read() and never return anything until it's received all the bytes of the message.

Both of those seem (IMHO) like functions you could have in a TCP library. Neither seems (IMHO) like a higher level protocol.


yuribro on May 16, 2020 | root | parent | next [–]


The reason that this function doesn't exist is that even if the sender sent a "message" in a single API call, there is no guarantee that the networking code will send it in one IP packet (or one layer 2 frame). We don't want to couple the message size with the lower network protocol, the type of network equipment and so on, and also want to allow merging of messages for more efficient use of the network (If we have a large window size it allows better throughput).

So the job of separating the stream into messages is left to the application layer. (unlike for example in UDP, but then you have to worry about dropped messages)


SilasX on May 16, 2020 | root | parent | next [–]


What is that responding to? Of course TCP can split a message across multiple packets of the lower level protocol; that has no bearing on the concepts TCP works in and whether it can indicate end-of-message.

If your point is just that TCP doesn't have a concept of a "message" (a bytestream with a clear beginning and ending), then that's fair, but, as I said elsewhere [1] the original comment took for granted that TCP does have a well defined notion of "you've reached the end of the message", or at least, "there is no further data to receive". No one seemed to have a problem with that there, and I was just working off that assumption.

As before, I haven't checked whether this is true (I can't quickly verify from descriptions of TCP).

And, interestingly enough, there's this comment [2], which says that what I described does exist, but isn't the default. So ... I'm at least not getting a consistent answer to my question, and people who think they know what they're talking about are inconsistent with each other.

[1] https://news.ycombinator.com/item?id=23200293

[2] https://news.ycombinator.com/item?id=23200906


jacobolus on May 16, 2020 | root | parent | next [–]


> the original comment took for granted that TCP does have a well defined notion of "you've reached the end of the message"

No, the original comment was complaining about a coworker who didn’t understand (and refused to listen when told otherwise) that there is no such notion in TCP. It was a response to another comment complaining about people on the internet (e.g. Stack Overflow) too often making the same mistake.

You’re more or less playing the part of that coworker here. It’s unclear why.


yuribro on May 17, 2020 | root | parent | prev | next [–]


> Both of those seem (IMHO) like functions you could have in a TCP library. Neither seems (IMHO) like a higher level protocol.

It's responding to the last part. It is a higher level of protocol. In the traditional TCP/IP model, it's in the application level. There are many libraries with an API like you asked, they are just in a higher level.

(and MSG_WAITALL is a partial solution, applicable only if you know in advance the exact size of the message you are about to receive)


Matthias247 on May 16, 2020 | root | parent | prev | next [–]


There is no "message" in TCP. So no - you can not have this in a library.


scott_s on May 18, 2020 | root | parent | prev | next [–]


If it makes you feel better, calling read() on a TCP socket is basically the same as calling read() on a file in a file system. In both cases, you can always end up reading less than you expected, and you always must check how much you actually read. In practice, this means that calls to read() for both should always be in a loop.
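
The loop in question might look roughly like this in Python. It is a sketch, not a library function: `recv_exact` is a hypothetical name, and it assumes a blocking socket and a caller that already knows how many bytes it wants:

```python
import socket


def recv_exact(sock, n):
    """Keep calling recv() until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))  # may return fewer bytes than asked for
        if not chunk:
            # An empty result means the peer closed before sending everything.
            raise ConnectionError(f"peer closed after {len(buf)} of {n} bytes")
        buf.extend(chunk)
    return bytes(buf)
```

Knowing `n` is the part TCP cannot help with; that number has to come from the application protocol (a fixed size, a length field, a delimiter).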


jfkebwjsbx on May 16, 2020 | root | parent | prev | next [–]


Those are perfect examples of a higher level protocol, since there is no way you can do it only with TCP.


jayshua on May 16, 2020 | root | parent | prev | next [–]


That might be nice. But that's not how it works.


Nursie on May 16, 2020 | root | parent | prev | next [–]


> which, I would think, would be handled by the TCP library read()

You would be wrong.


SilasX on May 16, 2020 | root | parent | next [–]


Thanks for the explanation, but I don't see why a library utility function wouldn't do that.


spydum on May 16, 2020 | root | parent | next [–]


But if the goal is to process or retrieve a string of bytes, how do you know when you've got them all? That is the root of the problem: tcp isn't built to exchange messages, it's just the stream transport layer. If you want messages you have to encode and decide that yourself.


SilasX on May 16, 2020 | root | parent | next [–]


Sorry, the rest of the conversation seemed to be assuming that "whether you have received the full message" is well-defined at the level of TCP, as suggested by the original comment where I joined[1]:

>I had to fight a coworker on this. I had quickly created some client code just to validate that the server was working. Due to some quirk, all the messages were arriving in full in every read call. He told me to ship it.

>I said no! "I need to check if there's more data and if so add a loop to read again" "But it is working, release it". That went on for a while, to no avail. Wouldn't look at documentation either.

I was just going along with that.

I didn't check up on TCP further to verify whether this was actually true; if you are saying that's not a well-defined concept, you might want to reply to that comment to say so.

[1] https://news.ycombinator.com/item?id=23195230


jfkebwjsbx on May 16, 2020 | root | parent | next [–]


Those quotes don't claim messages are received in full, you are misinterpreting them.


SilasX on May 16, 2020 | root | parent | next [–]


Of course, but they do indicate that "there is still more of the message to receive" is a well-defined concept (or at least "there is more data to receive" is), which is all my point requires.

(Edit: also, it would help if you said what the correct interpretation would be, since it has the phrase "messages were arriving in full in every read call".)


Nursie on May 16, 2020 | root | parent | next [–]


>but they do indicate that "there is still more of the message to receive"

Not at the TCP level they don't.

They just give you some bytes. It's up to you to decide whether you have the full message or not and if you want to try to read more. The TCP read functions just give you the data they have. There is no concept of one write at the sender end translating to a complete message at the other end. It's just a stream of bytes.


anonymoushn on May 17, 2020 | root | parent | prev | next [–]


That poster's coworker was implementing a message-oriented protocol and testing the client and server on the same machine. When running the software in this configuration, the coworker observed that each read returned an entire message, even though this was not going to be true in other configurations where the client and server are different machines or the messages are larger.


anonymoushn on May 16, 2020 | root | parent | prev | next [–]


Most TCP libraries simply do not come with a function that does "read, but only give me 0 bytes or n bytes, and if you get some other number please hang on to the leftovers until next time". I guess I could follow your advice, but the first step of doing that would be to write such a function again.


userbinator on May 16, 2020 | root | parent | next [–]


They do... with an option called MSG_WAITALL.

That naturally leads to the question of why it's not the default, which can be answered by understanding the history of TCP and computer networking; and more interestingly, how might things have been different if MSG_WAITALL was the default from the beginning.
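
For reference, a sketch of what that flag looks like from Python, assuming a blocking socket and a platform where `socket.MSG_WAITALL` is available; the demo function, the 10-byte size and the sleep are only illustrative. The sender deliberately writes in two pieces, and the single `recv()` call waits for both:

```python
import socket
import threading
import time


def recv_waitall_demo():
    """One recv() with MSG_WAITALL while the peer sends in two pieces."""
    a, b = socket.socketpair()  # local stand-in for a TCP connection

    def sender():
        a.sendall(b"12345")
        time.sleep(0.05)        # force the second half to arrive later
        a.sendall(b"67890")

    t = threading.Thread(target=sender)
    t.start()
    try:
        # Blocks until 10 bytes are available; it can still return fewer
        # on EOF, an error, or an interrupting signal, so check the length.
        return b.recv(10, socket.MSG_WAITALL)
    finally:
        t.join()
        a.close()
        b.close()
```

This only helps when the receiver already knows the exact byte count, and it applies to blocking reads.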


anonymoushn on May 16, 2020 | root | parent | next [–]


I don't think this gets you nonblocking all-or-nothing recv.


bavell on May 16, 2020 | root | parent | prev | next [–]


If you're using a library, then sure. But if you're just reading from a raw TCP socket, it's just a stream of bytes. It's up to your application to parse those bytes (e.g. into a http request).

The OS will buffer bytes received from TCP packets for you until you read from the socket again to drain the buffer. Your application needs to determine how to semantically chop those bytes up into the protocol it's expecting (e.g. http request).

My low-level networking chops are a little rusty so please correct my understanding if I'm off-base somewhere.


Ididntdothis on May 15, 2020 | parent | prev | next [–]


"But it is working, release it".

Famous last words :-)


austincheney on May 15, 2020 | root | parent | next [–]


Sounds like how most software handles security until it’s audited.


lalaland1125 on May 16, 2020 | parent | prev | next [–]


Unless you are writing TCP drivers I don't see how knowing the exact TCP handshake is useful for software development.


Dylan16807 on May 16, 2020 | parent | prev | next [–]


> the rest of the conversation seemed to be assuming that "whether you have received the full message" is well-defined at the level of TCP

I don't see anything in the comment you linked that implies that.

Sorry you got confused, but there's no need for anyone to go reply to that comment to say so.


anilakar on May 15, 2020 | prev | next [–]


In the fall of 2016 I had a lengthy email exchange with an industrial automation vendor who didn't understand this issue. I even mailed them a short Python proof-of-concept snippet that slept a few milliseconds between the write() calls and in response got back my code "fixed" with the sleep removed.

In between the emails I googled a bit and found the changelogs for the RTOS they were using. Turned out that it was a bug in the upstream HTTP server. This also meant that the platform they were using had all the security holes from those five-plus years. The bug was later silently fixed when they acquired a newer release from upstream.

Currently I'm having a similar issue with the very same vendor. This time they don't understand why client-side authentication means no authentication at all and why passwords must not be stored in plain text in the database that can be remotely backed up from the device.


irrational on May 15, 2020 | parent | next [–]


Why don't you tell us the vendor's name? It seems like the responsible thing to do.


anilakar on May 15, 2020 | root | parent | next [–]


Even after the bug gets fixed, it'll probably take years for all the embedded devices in the public internet to get patched, so no.


laughinghan on May 15, 2020 | root | parent | next [–]


But in the meantime, won't the vendor keep adding more broken devices to the public internet, making the problem worse?

The longer it takes for this problem to become public, won't more harm be caused when it does become public?


loopz on May 15, 2020 | root | parent | next [–]


We're just waiting for the free market to kick in.


outworlder on May 15, 2020 | parent | prev | next [–]


> This time they don't understand why client-side authentication means no authentication at all

I've seen this... with an intern! I can't imagine dealing with a whole team like that.


throwaway_pdp09 on May 15, 2020 | parent | prev | next [–]


How do you not kill these people? How do you put up with it? How do vendors like this survive?


maartenh on May 15, 2020 | root | parent | next [–]


Just like in nature, they survive because they are good enough, and don't experience enough competition to be eliminated by selection.


jolmg on May 15, 2020 | root | parent | next [–]


Depending on what kind of vendor we're talking about, it might be that such aspects aren't even part of what makes them competitive. The average user is not going to know about these types of issues, and so they're not even going to consider such issues when evaluating the vendor.


the8472 on May 15, 2020 | root | parent | prev | next [–]


full disclosure could put some selective pressure on them.


heavenlyblue on May 15, 2020 | root | parent | next [–]


I would assume the “free market” here is that these companies will over-extend themselves so much that they will no longer be able to hide the bugs from the malicious parties and their devices will start getting hacked en masse.

I would assume, however, that there is no law forcing minimal security so you can class A them, can you?


ink_13 on May 15, 2020 | root | parent | prev | next [–]


Just about every industrial automation vendor is like this in my experience. They never upgrade because they don't want to break anything.


bsder on May 16, 2020 | root | parent | prev | next [–]


In inverse order:

Because nobody gives a sh*t about quality unless it hits their paycheck.

"Onnnngggg. They pay me hourly. Onnnngggg."

They cut lots of checks.


twotwotwo on May 15, 2020 | prev | next [–]


Yeah. Fun problem for beginners, because 1) your incorrect code may work for a while when reads/writes are small or it's only run on a local network or such, 2) you might design a broken protocol if you don't understand fragmentation, etc., which will tend to be harder than (say) an isolated client bug to fix, 3) the implementation-dependent nature of fragmentation can make it look like you hit a language/library/OS issue, 4) your language/library may or may not offer tools to help a beginner to implement a delimited or framed wire format properly (ideally with things like record-size limits and timeouts).

Not sure it says anything you haven't, but a StackOverflow answer on fragmentation (framed by asker as Go not behaving like C) is one of the more-read ones I've written: https://stackoverflow.com/questions/26999615/go-tcp-read-is-...


wahern on May 15, 2020 | prev | next [–]


A version of Microsoft Exchange had a bug in its SMTP implementation that was tickled when lines crossed packet boundaries. (EDIT: The issue was more likely a bug in Exchange's TLS record processing, breaking when a logical line crossed TLS records.) My async SMTP library used a simple fifo for buffering outbound data which didn't realign the write pointer to 0 except when it was completely drained, so when reading slices (iovec's) from the fifo for write-out it would occasionally call write/send with an incomplete line (i.e. part of a line that wrapped around from the end of the fifo buffer array to the front) even if the application had only written full lines. (At the time it didn't support writev/sendmsg, though I'm not sure it would have helped as the TLS record layer might still have been prone to splitting logical lines across packets.) There was no bug here on my end--everything would be sent correctly--but you can't tell the customer that he can't send e-mail to some third-party because that third-party is using a broken version of Exchange.

The first quick fix was to unconditionally realign the fifo contents after every write (the fifo had a realign method), but that ran into a computational complexity problem when you had lots of small lines (e.g. the application caller dumped a huge message into the buffer and then flushed it out in one go) and a high-latency connection that resulted in many short writes; you were constantly memmove'ing the megabytes of remaining contents in the buffer for every tiny write you did. So then I ended up having to add a new interface to the fifo that returned a slice up to a limit but always ending with a specified delimiter (e.g. "\n") if the delimiter was within the maximum chunk size.

Of course, none of these fixes would have completely remedied the issue as lower layers (the TLS stack, the kernel TCP stack) could have still potentially split logical lines, and I'm sure did on occasion. But it at least seemed to put us on equal footing with everybody else in terms of how often it happened, which is really the best anybody could have done. Complaints did die down.


fenwick67 on May 15, 2020 | prev | next [–]


This probably bites lots of newbies, since when you're just sending traffic over localhost, the send()s and read()s tend to line up.


yjftsjthsd-h on May 15, 2020 | parent | next [–]


I have often wished for an "unhelpful testing environment" of sorts, to deal with these things before they get out of hand. It would feature a compiler that had creatively different interpretations of undefined behaviors, randomly compile against glibc and musl, have a base OS lovingly crafted from Ubuntu, but with most coreutils replaced with busybox and/or BSD versions. And, now, I suppose, it would have a customized network stack (kernel module?) that would randomly reorder/drop/duplicate packets, randomly reselect MTU on every boot, or maybe just randomly fragments things regardless of MTU. Ideally it would come with a FAQ of "my program broke on X; what did I do wrong?".

The idea being that if your software is actually written to relevant standards, and actually handles things properly outside the golden path, then it should still work fine. If, however, you accidentally did something implementation-defined, or that only worked by coincidence, this system will break it.


jeroenhd on May 15, 2020 | root | parent | next [–]


There are tools that intentionally insert failures into the network streams of applications. A few of them are described here: https://medium.com/@docler/network-issues-simulation-how-to-...

The other linking/OS problems can probably be automated with some simple integration tests and a bunch of different docker containers to compile the code in. Should be possible to squeeze it into a CI/CD flow somewhere with some clever tricks.


Matthias247 on May 15, 2020 | root | parent | prev | next [–]


I created such an environment for my unit-tests: Wrapping TCP sockets in a stream which only accepts 1 byte at a time in both directions and returns EAGAIN on every second read provides an easy way to make sure the code on top of the socket does perform all the correct retries.
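
A rough sketch of such a test double in Python. This is a simplification rather than the setup described above: instead of wrapping a real socket it serves bytes from memory, and `TrickleSocket`, `read_exact_with_retries` and `write_all` are hypothetical names:

```python
import errno


class TrickleSocket:
    """Moves at most one byte per call; every second recv() raises EAGAIN."""

    def __init__(self, incoming=b""):
        self._incoming = bytes(incoming)
        self._pos = 0
        self.recv_calls = 0
        self.sent = bytearray()

    def recv(self, bufsize):
        self.recv_calls += 1
        if self.recv_calls % 2 == 0:
            raise BlockingIOError(errno.EAGAIN, "simulated EAGAIN")
        chunk = self._incoming[self._pos:self._pos + min(1, bufsize)]
        self._pos += len(chunk)
        return chunk

    def send(self, data):
        self.sent += bytes(data[:1])  # accept only the first byte
        return min(1, len(data))


def read_exact_with_retries(sock, n):
    """Code under test: must survive short reads and EAGAIN."""
    buf = bytearray()
    while len(buf) < n:
        try:
            chunk = sock.recv(n - len(buf))
        except BlockingIOError:
            continue  # real code would wait in select/poll or an event loop
        if not chunk:
            raise ConnectionError("peer closed early")
        buf.extend(chunk)
    return bytes(buf)


def write_all(sock, data):
    """Code under test: must survive short writes."""
    view = memoryview(data)
    while len(view) > 0:
        view = view[sock.send(view):]
```

Code that assumes one `recv()` per message fails immediately against a socket like this, which is the point.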

That will most likely not help newcomers who write their code directly against the OS socket. But once you get a better understanding of the topic and start adding tests to your codebase it's rather easy to add.


nitwit005 on May 15, 2020 | root | parent | next [–]


I've done something similar of forcing the sends to be a single byte at a time. That's usually enough to find the obvious issues in parsing data.


nicolaslem on May 15, 2020 | prev | next [–]


One way to stop falling into this trap is by knowing what happens behind the send syscall: the application is not sending bytes down the wire, it just fills a buffer in the OS. Once in the buffer there is no boundary between bytes from different send calls. Same thing for receiving, in reverse.


brlewis on May 15, 2020 | prev | next [–]


For me, at least in this decade, it would have been better if I didn't know that. I put off learning websockets longer than I should have because I don't find packet boundaries fun to deal with, and my interest in websockets was mainly for fun. Then when I finally picked websockets up I was pleasantly surprised that message framing is built in.


Unklejoe on May 15, 2020 | prev | next [–]


> stream of bytes

I've always wondered: What's the best/de facto way to delimit this back into packets at the application level on the receiving end?

I would think the obvious approach would be to insert some magic word into the stream so that you can re-sync.

Or is this not an issue since you know that once you're connected, you'll never drop a single byte, therefore, the only way to get out of sync would be a program error?


jstanley on May 15, 2020 | parent | next [–]


You will never drop a single byte.

If you need some packet-oriented messaging, you could use something like http://jsonlines.org/ (i.e. JSON messages separated by newline characters), or https://github.com/protocolbuffers/protobuf if it's more performance-critical.
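
For the newline-delimited JSON option, the receiving side could be sketched like this in Python. `LineReader` and its `max_line` limit are assumptions for illustration, not part of any JSON Lines library; the idea is to buffer whatever `recv()` returns and only parse complete lines:

```python
import json


class LineReader:
    """Reassemble newline-delimited JSON messages from arbitrary chunks."""

    def __init__(self, max_line=1 << 20):
        self._buf = bytearray()
        self._max_line = max_line  # hypothetical cap on a single line

    def feed(self, chunk):
        """Add received bytes; return the messages completed so far."""
        self._buf.extend(chunk)
        messages = []
        while True:
            nl = self._buf.find(b"\n")
            if nl < 0:
                break                      # no complete line buffered yet
            line = bytes(self._buf[:nl])
            del self._buf[:nl + 1]         # drop the line and its newline
            if line.strip():
                messages.append(json.loads(line))
        if len(self._buf) > self._max_line:
            raise ValueError("line exceeds limit")
        return messages


reader = LineReader()
first = reader.feed(b'{"a": 1}\n{"b"')   # second message is incomplete
second = reader.feed(b': 2}\n')          # now it completes
```

This relies on each message being serialized on a single line, which is what the JSON Lines format requires.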


timeinput on May 15, 2020 | root | parent | next [–]


Protobuf isn't self delimiting so you still have to have some extra packet wrapper around it to say the length.

I like zeromq to get to a packet based system.


mytailorisrich on May 15, 2020 | parent | prev | next [–]


The standard way is to include explicit information on the length of the message that is following.

For example if the message is x bytes long then you first send 'x' then you send the x bytes of the message.

Or your messages have a defined header that contains the length of the message payload.


vasilvv on May 15, 2020 | parent | prev | next [–]


It will never get out of sync, because TCP guarantees that the bytes will be delivered in the same order they were sent.

The best approach is typically to put a length in front of every message. The good things about that approach are:

1. The receiver can allocate a buffer that is exactly the size it needs to fit the message.
2. The receiver can check whether the message is too long before seeing the entire message.

The only disadvantage is that you have to know the length of all messages in advance.

kccqzy on May 15, 2020 | root | parent | next [–]


Definitely be sure to check the length, though. Imagine a mistaken client trying to send HTTP: the first four bytes, "HTTP", interpreted as a 32-bit integer in either byte order, would call for an absurdly large buffer.
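
To make the size of the problem concrete, the sketch below (Python) reads the ASCII bytes "HTTP" as a 32-bit length in both byte orders and rejects anything above a limit before allocating. `MAX_MESSAGE_SIZE`, `FrameTooLarge` and `parse_length` are hypothetical names, and the 1 MiB cap is an arbitrary assumption.

```python
import struct

MAX_MESSAGE_SIZE = 1 << 20  # assumed 1 MiB cap; pick whatever your protocol needs


class FrameTooLarge(ValueError):
    """Raised when a declared length exceeds the configured limit."""


def parse_length(header: bytes, limit: int = MAX_MESSAGE_SIZE) -> int:
    """Decode a 4-byte big-endian length and reject it before allocating."""
    (length,) = struct.unpack("!I", header)
    if length > limit:
        raise FrameTooLarge(f"declared length {length} exceeds limit {limit}")
    return length


# The ASCII bytes "HTTP" misread as a length prefix, in both byte orders.
as_big_endian = int.from_bytes(b"HTTP", "big")
as_little_endian = int.from_bytes(b"HTTP", "little")
print(as_big_endian, as_little_endian)  # both are larger than 1 GiB

try:
    parse_length(b"HTTP")
    rejected = False
except FrameTooLarge:
    rejected = True
print(rejected)
```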

nucleardog on May 15, 2020 | root | parent | next [–]


I mean, at this point you're effectively defining a new network protocol (or, you will be shortly once you implement ways to work around all the other issues you're going to run into). I'd go all-in from the start and start every packet with a magic string/byte sequence of your own, a length, and probably a version code just to make it extensible.
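
One possible shape for such a header, sketched in Python. Everything specific here is an assumption for illustration: the magic value `XMPL`, the one-byte version field, the 4-byte length and the 1 MiB cap are not taken from any existing protocol.

```python
import struct

MAGIC = b"XMPL"                  # hypothetical 4-byte marker for this sketch
VERSION = 1                      # hypothetical protocol version
MAX_PAYLOAD = 1 << 20            # assumed 1 MiB cap
HEADER = struct.Struct("!4sBI")  # magic, version, payload length (9 bytes)


def pack_frame(payload: bytes) -> bytes:
    return HEADER.pack(MAGIC, VERSION, len(payload)) + payload


def parse_header(header: bytes) -> int:
    """Validate a 9-byte header and return the payload length."""
    magic, version, length = HEADER.unpack(header)
    if magic != MAGIC:
        raise ValueError(f"bad magic {magic!r}: peer is speaking another protocol")
    if version != VERSION:
        raise ValueError(f"unsupported version {version}")
    if length > MAX_PAYLOAD:
        raise ValueError(f"payload of {length} bytes exceeds {MAX_PAYLOAD}")
    return length


frame = pack_frame(b"ping")
length = parse_header(frame[: HEADER.size])
payload = frame[HEADER.size : HEADER.size + length]
print(payload)
```

With the magic checked first, a stray HTTP client is turned away by the first comparison, before its bytes are ever treated as a length.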

Or see if there's an existing protocol you can abuse for what you want. If it's transactional, you get a pretty big ecosystem of battle-tested clients/servers/proxies/etc if you use HTTP.

userbinator on May 16, 2020 | root | parent | prev | next [–]


A 16-bit length (64k max message size) is usually sufficient, or even 24-bit (16M max) if you really feel the need, but 32 bits is far more than should be needed for parsing messages in memory; it would be fine for a streaming application, however (in which case a 64-bit length wouldn't be a bad idea either.)
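
The limits quoted above follow directly from the prefix width. A small Python sketch, with `encode_length` and `max_length` as hypothetical helpers and big-endian order assumed:

```python
def encode_length(n: int, width: int) -> bytes:
    """Encode n as a big-endian unsigned prefix that is `width` bytes wide."""
    # int.to_bytes raises OverflowError when n does not fit in `width` bytes.
    return n.to_bytes(width, "big")


def max_length(width: int) -> int:
    """Largest message length a prefix of `width` bytes can describe."""
    return (1 << (8 * width)) - 1


for width in (2, 3, 4, 8):
    print(f"{8 * width}-bit prefix: up to {max_length(width)} bytes")
```

A narrow prefix doubles as a sanity check: with two bytes, no input can ever ask the receiver for more than 65,535 bytes.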

kccqzy on May 16, 2020 | root | parent | next [–]


Good advice. I was actually referring to this: https://rachelbythebay.com/w/2016/02/21/malloc/ I read this article a long time ago, and yet I made a similar mistake in my own code.

genpfault on May 15, 2020 | parent | prev | next [–]


patrickmcmanus on May 16, 2020 | parent | prev | next [–]


Preamble of chunk length and 1 bit for an end-of-message indicator. If you only do chunk length, you will eventually find you can't stream but want to.
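
A sketch of one way to lay that out, in Python. The layout is an assumption for illustration: a 32-bit big-endian preamble whose top bit marks the last chunk of a message and whose low 31 bits carry the chunk length. `encode_chunks` and `ChunkDecoder` are hypothetical names.

```python
import struct

FINAL_BIT = 0x8000_0000    # top bit of the preamble: this chunk ends the message
LENGTH_MASK = 0x7FFF_FFFF  # low 31 bits: chunk length in bytes


def encode_chunks(pieces):
    """Frame one message, piece by piece, without knowing its total size."""
    previous = None
    for piece in pieces:
        if previous is not None:
            yield struct.pack("!I", len(previous)) + previous
        previous = piece  # hold one piece back so the last one can be flagged
    last = previous if previous is not None else b""
    yield struct.pack("!I", FINAL_BIT | len(last)) + last


class ChunkDecoder:
    """Reassemble messages from chunked frames arriving in arbitrary pieces."""

    def __init__(self):
        self._buffer = bytearray()
        self._parts = []

    def feed(self, data: bytes) -> list:
        self._buffer.extend(data)
        messages = []
        while len(self._buffer) >= 4:
            (word,) = struct.unpack_from("!I", self._buffer, 0)
            length = word & LENGTH_MASK
            if len(self._buffer) < 4 + length:
                break  # chunk body not fully received yet
            self._parts.append(bytes(self._buffer[4 : 4 + length]))
            del self._buffer[: 4 + length]
            if word & FINAL_BIT:
                messages.append(b"".join(self._parts))
                self._parts = []
        return messages


wire = b"".join(encode_chunks(iter([b"strea", b"ming ", b"body"])))
wire += b"".join(encode_chunks([b"short"]))

decoder = ChunkDecoder()
decoded = []
for i in range(0, len(wire), 3):  # deliver the stream in 3-byte pieces
    decoded.extend(decoder.feed(wire[i : i + 3]))
print(decoded)
```

The sender can start transmitting before it knows how long the whole message will be, which is the streaming property a single length prefix cannot offer.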

or just use http.

keitmo on May 16, 2020 | prev | next [–]


I often tell people to assume the TCP stack's buffering is arbitrary & capricious and will do the most inconvenient thing for your code. That can mean either a) dribbling data in one byte at a time per recv() call, or b) buffering multiple megabytes and returning it all in a single recv() call.
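
One way to hold code to that standard is to test it against a deliberately unhelpful fake socket. The Python sketch below uses a made-up `DribbleSocket` class that hands out at most `step` bytes per `recv()` call, plus a hypothetical `read_exact` helper as the code under test.

```python
class DribbleSocket:
    """Fake socket whose recv() returns at most `step` bytes per call."""

    def __init__(self, data: bytes, step: int = 1):
        self._data = data
        self._pos = 0
        self._step = step

    def recv(self, bufsize: int) -> bytes:
        n = min(bufsize, self._step)
        chunk = self._data[self._pos : self._pos + n]
        self._pos += len(chunk)
        return chunk  # b"" once the data runs out, like a closed connection


def read_exact(sock, n: int) -> bytes:
    """Code under test: must work however the bytes are sliced."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("stream ended early")
        data.extend(chunk)
    return bytes(data)


payload = b"0123456789"
one_byte_at_a_time = read_exact(DribbleSocket(payload, step=1), len(payload))
all_at_once = read_exact(DribbleSocket(payload, step=1 << 20), len(payload))
print(one_byte_at_a_time == all_at_once == payload)
```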

richardwhiuk on May 15, 2020 | prev | next [–]


If you do want that, then SCTP will provide it.

jes5199 on May 15, 2020 | prev | next [–]


if you turn off Nagle's algorithm, it gets closer to this though

jfkebwjsbx on May 15, 2020 | parent | next [–]


No, it has nothing to do with that.

Animats on May 16, 2020 | root | parent | next [–]


I suppose I should say something.

The thing to turn off is delayed ACKs. See "TCP_QUICKACK". Delayed ACKs are a feature that is only useful for things like Telnet, where the payload in each packet is one character when the user is typing. The fixed timer for delayed ACKs is tuned for keyboard typing speeds, and for networks so slow that human typing could congest them. There's a reasonably good explanation here.[1]
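
For reference, this is roughly how the two socket options discussed here are set from Python. It is a sketch under assumptions: `configure_low_latency` is a hypothetical helper, `TCP_QUICKACK` is Linux-specific (hence the `getattr` guard), and the Linux tcp(7) man page describes the flag as not permanent, so real code would typically set it again after reads. Neither option restores message boundaries.

```python
import socket


def configure_low_latency(sock: socket.socket) -> dict:
    """Apply TCP_NODELAY and, where available, TCP_QUICKACK; report what was set."""
    # Disable Nagle's algorithm: small writes are sent without waiting for ACKs.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    applied = {"nodelay": True, "quickack": False}

    # TCP_QUICKACK only exists on Linux, so look it up defensively.
    quickack = getattr(socket, "TCP_QUICKACK", None)
    if quickack is not None:
        try:
            sock.setsockopt(socket.IPPROTO_TCP, quickack, 1)
            applied["quickack"] = True
        except OSError:
            pass  # the platform refused the option; carry on without it
    return applied


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
applied = configure_low_latency(sock)
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
print(applied, nodelay)
```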

As others said above, TCP is not a message protocol. It's a stream protocol. If you're sending messages over a stream, you need something that's reading data from the stream, and when it has a full message, it sends that off to be processed. There is no set of TCP options which will reliably cause one write at the sending end to result in one read at the receiving end. If there were, it would be inefficient for small messages and would fail for large ones.
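
The "something that's reading data from the stream" can be quite small. Below is a minimal Python sketch of such a reader, assuming 4-byte big-endian length prefixes. `FrameDecoder` and its `on_message` callback are names invented for this example, and the 1 MiB cap is an arbitrary assumption.

```python
import struct


class FrameDecoder:
    """Buffer stream data and hand off each complete length-prefixed message."""

    HEADER = struct.Struct("!I")  # assumed 4-byte big-endian length prefix

    def __init__(self, on_message, max_size: int = 1 << 20):
        self._on_message = on_message
        self._max_size = max_size
        self._buffer = bytearray()

    def feed(self, data: bytes) -> None:
        """Call with whatever recv() returned, however little or much."""
        self._buffer.extend(data)
        while len(self._buffer) >= self.HEADER.size:
            (length,) = self.HEADER.unpack_from(self._buffer, 0)
            if length > self._max_size:
                raise ValueError(f"declared length {length} exceeds limit")
            end = self.HEADER.size + length
            if len(self._buffer) < end:
                return  # message body not complete yet
            message = bytes(self._buffer[self.HEADER.size : end])
            del self._buffer[:end]
            self._on_message(message)


def frame(payload: bytes) -> bytes:
    return FrameDecoder.HEADER.pack(len(payload)) + payload


stream = frame(b"one") + frame(b"") + frame(b"three")

dribbled, bulk = [], []
byte_decoder = FrameDecoder(dribbled.append)
for i in range(len(stream)):
    byte_decoder.feed(stream[i : i + 1])  # one byte per "read"
FrameDecoder(bulk.append).feed(stream)    # everything in a single "read"
print(dribbled == bulk, bulk)
```

Both delivery patterns produce the same three messages, because the decoder depends only on the bytes and never on how the reads were split.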

[1] https://www.extrahop.com/company/blog/2016/tcp-nodelay-nagle...

What they mostly should know: TCP provides a bidirectional stream of bytes on th... (75)

Your terminology is a little off. TCP does not provide anything for the application layer, as it is a transport-layer protocol. The application layer rides on top of that. Examples of transport protocols are TCP and UDP, while application protocols are things like HTTP, SSH, IRC, and all those things your applications use.

The network layer on which the transport layer rides is packet switched. TCP uses segments, with each segment having its own header and sequence number. Streams are just a series of segments sent over a single established handshake without a previously defined termination segment.

Matthias247 on May 15, 2020 | parent [–]


I didn't mean to talk about OSI terminology. It was more about: [user-space] applications which use the TCP/IP stack do not observe packet boundaries, whereas the kernel certainly does. Obviously this is a bit ambiguous, and you can even get packet boundaries in user-space by running a TCP stack there. But for most TCP/IP usages it holds true.

austincheney on May 15, 2020 | root | parent [–]


> It was more about: [user-space] applications which use the TCP/IP stack do not observe packet boundaries

That is still a bit imprecise. Userland applications won't directly see TCP, as they are just looking at an application protocol. Typically it's the OS that packages and unpacks the application protocol data into TCP segments, so of course the userland application won't see it, since it's not managing that part of the communication.

https://en.wikipedia.org/wiki/Transmission_Control_Protocol#...

There are some exceptions where some application platforms allow developers to write custom TCP protocols, such as Node.js, but these exceptions generally apply to network services and don't commonly apply to the end user application experience.

https://nodejs.org/dist/latest-v14.x/docs/api/net.html#net_n...
