
TCP/IP HTTP

Started by savage, August 03, 2006, 02:23:22 PM


savage

Hi, I'm using the wsock32 library, and I am a beginner at network programming. 
I'm making progress, but I'm getting some errors that I can't work out.

I've read up on some TCP books, and I'm just trying to understand the ordinary functions such as socket, connect, gethostbyname, send, and recv.
So far, I've made a simple Client-side TCP program with 2 input boxes, one for the IP address to send to, and one for the text to send.  Then I have one output text box to display the results from the server. I've actually been able to connect and send this text:
Quote
GET / HTTP/1.1


To the domain www.google.com (64.233.167.147) and get a result back (e.g. HTTP/1.1 200 OK, stuff like that).

Okay...so something is working, at least.


But I got lucky on google.  Almost no other sites work.
Here's the total list of TCP-related functions I'm calling in my program, in general order:
Quote
WSAStartup
inet_addr
gethostbyname
inet_ntoa
htons

socket
connect <-----
send
recv
closesocket

WSACleanup

The function I'm pointing out here is connect, since it's where my problem is.
Most servers (other than google, for some reason) are not responding when I try to connect.
So the function never returns, until it times out (about 20 seconds).

Here is what my connect looks like... seems to be normal from what I read in the books.

invoke connect, hSocket, Offset mySocketAddr, Sizeof mySocketAddr


The exact same thing happens when I try to connect to a nonexistent IP address,
so my first thought was that I am using the DNS lookup (gethostbyname) incorrectly.
But then that gives no explanation for why www.google.com is working.

Can anyone think of any tips from experience they've had with this issue?
I'm sure I'm not the first person to scratch my head with this; internet stuff isn't
the easiest to debug =(.

Tedd

A minimal request would usually be

GET / HTTP/1.1
Host: www.somesite.com
Connection: close


(replace "www.somesite.com" with the site you're requesting from -- even if it's the same one you're connected to)
And don't forget the extra blank line at the end - to say that's the end of the request. [And all newlines must be CR+LF]

This seems to work with most sites, but some will be picky and mess up unless you provide a "User-Agent:" field and sometimes "Referer:"
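
As a rough sketch of the request layout described above (in Python rather than MASM, just to keep it short): build_request is a made-up helper name and the User-Agent value is only a placeholder. The point is simply that every line ends in CR+LF and the request finishes with one empty line.

```python
def build_request(host, path="/", extra_headers=None):
    """Build a minimal HTTP/1.1 GET request as bytes (hypothetical helper)."""
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}".format(host),
        "Connection: close",
    ]
    # Optional fields such as User-Agent or Referer go after the required ones.
    for name, value in (extra_headers or {}).items():
        lines.append("{}: {}".format(name, value))
    # Every line ends in CR+LF; the extra empty line marks the end of the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request("www.somesite.com",
                        extra_headers={"User-Agent": "test1234567"})
print(request)
```

The length to pass to send would then be the length of this byte string, not the size of whatever buffer holds it.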

Happy playing :wink
No snowflake in an avalanche feels responsible.

James Ladd

savage,

You don't mention setting the port, but I'm sure you are.

If you are going to send http requests to a server then you should read a little about the protocol and
ensure you send what the server expects. This is playing nice with the server and it should play nice
back.
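
For reference, here is a small sketch of what the 16-byte IPv4 socket address passed to connect should contain, assuming the usual sockaddr_in layout on x86. It is written in Python only to show the bytes; pack_sockaddr_in is a hypothetical name, and the address is the one quoted earlier in the thread. The thing to check is that the port is stored in network byte order, which is what htons produces.

```python
import socket
import struct

AF_INET = 2  # IPv4 address family value


def pack_sockaddr_in(ip, port):
    """Pack a 16-byte sockaddr_in image for inspection (illustrative only)."""
    family = struct.pack("<h", AF_INET)   # sin_family: host order, little-endian on x86
    net_port = struct.pack("!H", port)    # sin_port: network byte order, like htons
    addr = socket.inet_aton(ip)           # sin_addr: 4 bytes, network byte order
    return family + net_port + addr + b"\x00" * 8  # sin_zero padding


raw = pack_sockaddr_in("64.233.167.147", 80)
print(raw.hex())
```

If the port bytes come out swapped in your own structure, connect will try a different port and will most likely just time out.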

savage


Thanks guys.
I've noticed that if I send something like "GET /path/file.html HTTP/7.0", google doesn't respond, which was a hint to me that servers don't always respond for an invalid HTTP request.

So perhaps it's just that google is more forgiving than other sites.  Perhaps some servers are just stuck up and won't talk to me unless I'm polite  :snooty: .   lol...



Quote from: Tedd on August 03, 2006, 04:23:24 PM
A minimal request would usually be

GET / HTTP/1.1
Host: www.somesite.com
Connection: close


(replace "www.somesite.com" with the site you're requesting from -- even if it's the same one you're connected to)
And don't forget the extra blank line at the end - to say that's the end of the request. [And all newlines must be CR+LF]

This seems to work with most sites, but some will be picky and mess up unless you provide a "User-Agent:" field and sometimes "Referer:"

Happy playing :wink

You are probably right.  Still, I can't get it to work.
Could you please show me an http request that would work for a site like, for example, www.yahoo.com?




P.S.  What in the $%#@$%@#$% is the point of the "Host: www.somesite.com" line?? Who came up with this?? It is so redundant!!!!!!!!!!!

Google:  :U
Yahoo:  :snooty:

P1

Quote from: savage on August 04, 2006, 12:47:50 PM
P.S.  What in the $%#@$%@#$% is the point of the "Host: www.somesite.com" line?? Who came up with this?? It is so redundant!!!!!!!!!!!
It is to facilitate multi-site hosting on a single Host Server.
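
A toy sketch of the idea, in Python for brevity: the SITES table and pick_site are invented for illustration and are not how any particular server is implemented. Two site names share one IP address, and the Host line is the only thing that tells them apart.

```python
# Hypothetical table: two site names served from the same IP address.
SITES = {
    "www.somesite.com": "content for somesite",
    "www.othersite.com": "content for othersite",
}


def pick_site(request_bytes):
    """Return the content chosen by the request's Host header, or None."""
    head = request_bytes.split(b"\r\n\r\n", 1)[0].decode("ascii")
    for line in head.split("\r\n")[1:]:        # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return SITES.get(value.strip().lower())
    return None  # no Host header: nothing to tell the sites apart


print(pick_site(b"GET / HTTP/1.1\r\nHost: www.othersite.com\r\n\r\n"))
```

This simplified lookup ignores a port suffix in the Host value, which a real server would have to handle.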

Regards,  P1  :8)

savage

Actually, I can't even connect to yahoo, so I don't think the HTTP segment is even used yet.

nevermind, I fixed that =P

But yahoo still won't respond properly when I send the request.

Shantanu Gadgil

Just out of curiosity...where is this going? Are you making download manager software?  :bg :bg

Cheers,
Shantanu
To ret is human, to jmp divine!

P1

There is always room for better mouse traps,  aaaahhhhh, web browsers.  Do web browsers trap mice?   :lol

Regards,  P1  :8)

arafel

Quote from: savage on August 04, 2006, 12:47:50 PM
...
Could you please show me an http request that would work for a site like, for example, www.yahoo.com?

GET / HTTP/1.0
Host: www.yahoo.com
Range: bytes=0-
User-Agent: test1234567
Connection: Close
Accept: */*


You might want to read RFC 2068.

Tedd

I tested my minimal request on yahoo.com and it worked.. (see below)

Don't send "Range:" unless you intend to use it :P (and servers don't actually 'have' to support it)


Want to post your app? :wink

Tedd

Attached: Example of data sent/received in connecting to yahoo.com

The html page received basically tells you to use a newer browser (because there was no recognised user-agent) -- which is why you need to experiment with the fields to see what works for your purpose.


HTTP/1.1 http://www.ietf.org/rfc/rfc2616.txt
(light reading :bdg)

[attachment deleted by admin]

savage

Well what I am doing is trying to learn how to use TCP, ... my program doesn't really do anything other than just test stuff.

One thing that's throwing me off is the way hostent is defined.

Windows.inc:

hostent STRUCT
  h_name      DWORD      ?
  h_alias     DWORD      ?
  h_addr      WORD       ?
  h_len       WORD       ?
  h_list      DWORD      ?
hostent ENDS


MSDN: (I changed the C++ to assembly, since all I'm trying to point out is the difference in names of the struct elements)

hostent STRUCT
  h_name      DWORD      ?
  h_aliases   DWORD      ?
  h_addrtype  WORD       ?
  h_length    WORD       ?
  h_addr_list DWORD      ?
hostent ENDS


Obviously, I can edit Windows.inc to use the standard names, but then other projects that depend on the current names may break.
Or I could go with the Windows.inc names, but it's hard and annoying to translate when I'm reading a book on TCP and it's telling me to use the "h_addrtype" member of the struct and none exists.  Obviously this is something that can be swatted away with a "deal with it," but I think maybe Windows.inc oughta be updated in future versions.  I say I *think maybe* because I'm not sure if I'm seeing the whole picture, as I'm a beginner with this internet programming stuff.
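
To show that only the names differ and not the layout, here is a small Python sketch that unpacks the same 16 bytes using the MSDN field names. It assumes the 32-bit layout listed above (two pointers, two 16-bit values, one pointer); HostEnt and parse_hostent are made-up names, and the pointer values in the sample are invented.

```python
import struct
from collections import namedtuple

# Field names follow MSDN; offsets match the 32-bit structure shown above.
HostEnt = namedtuple("HostEnt",
                     "h_name h_aliases h_addrtype h_length h_addr_list")


def parse_hostent(raw):
    """Unpack a 16-byte, 32-bit little-endian hostent image (illustrative)."""
    return HostEnt._make(struct.unpack("<IIhhI", raw[:16]))


# Made-up pointer values, purely for demonstration.
sample = struct.pack("<IIhhI", 0x00403000, 0x00403010, 2, 4, 0x00403020)
entry = parse_hostent(sample)
print(entry.h_addrtype, entry.h_length)
```

So whichever set of names is used, the third field is the address type at offset 8 and the fourth is the address length at offset 10.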

In the meantime, I'll touch up my source code to be readable (I was using some little shortcuts for certain things, and since they very well may be part of the problem, I'll standardize my code before I ask you guys to spot the problem =P )

Thanks a lot for the support, keep it up.  :bg

savage

OOOOOOOOOOOOOOOoooooooookkkkkkkkkkkkkkaaaaaaaaayyyyyyyyy!   :bg

I solved the problem.

Apparently I was sending out a message, but giving it the wrong size.
So basically there were a bunch of zeroes trailing my message.
Which was not good.

I looked to see if perhaps I had some extra zeros on the end of what I was sending, perhaps an unwanted null terminator that snuck in there, so I decided to find out character for character what I was sending out.
I realized I was sending out a string but saying its size was "500". You see, I had assumed the length argument was a "maximum" buffer size, like the one I give the recv function, so I typed in an arbitrarily large value (500). But obviously the function has no way of knowing where my string ends, so it was just sending out whatever junk was there to fill up the 500 characters.
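
A quick Python sketch of the difference, for anyone who hits the same thing: bytes_to_send is a hypothetical helper that mimics a strlen-style length, and the request text is just an example. With a zero-filled 500-byte buffer, passing 500 sends the text plus all the padding, while passing the string length sends only the request.

```python
BUFFER_SIZE = 500  # the arbitrary size from the post


def bytes_to_send(buffer):
    """Return only the text before the first zero byte (strlen-style length)."""
    end = buffer.find(b"\x00")
    return buffer if end == -1 else buffer[:end]


message = b"GET / HTTP/1.1\r\nHost: www.yahoo.com\r\nConnection: close\r\n\r\n"
buffer = message + b"\x00" * (BUFFER_SIZE - len(message))  # zero-filled buffer

wrong = buffer                 # what goes out when the length argument is 500
right = bytes_to_send(buffer)  # what goes out when the length is the string length
print(len(wrong), len(right))
```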

Now it seems to work.  Very pleasant.

[attachment deleted by admin]

Mark Jones

"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08