http_server uploaded file handling #11

DartBot · 2015-06-05T22:32:24Z

Originally opened as dart-lang/sdk#14303

This issue was originally filed by [email protected]

Current http_body implementation decodes / parses HttpBodyFileUpload.content based upon the Content-Type header of the part.

However, file server applications need the raw uploaded file data. Such applications save received files to their file system without modification. I think we need a switch that disables decoding / parsing for all content types and simply returns a List<int> as HttpBodyFileUpload.content.
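
As a rough illustration of the file-server use case described above, the sketch below writes uploaded bytes to disk unchanged. The helper name `saveUploadRaw`, the fallback name `upload.bin`, and the filename sanitizing are assumptions made for this example; none of them are part of http_server.

```dart
import 'dart:io';

/// Hypothetical helper: stores [content] byte-for-byte under [dir].
File saveUploadRaw(Directory dir, String filename, List<int> content) {
  // Keep only the last path segment so a client-supplied name cannot
  // escape the target directory.
  var name = filename.split(RegExp(r'[\\/]')).last;
  if (name.isEmpty || name == '.' || name == '..') {
    name = 'upload.bin'; // assumed fallback name
  }
  final file = File('${dir.path}${Platform.pathSeparator}$name');
  file.writeAsBytesSync(content); // no decoding or parsing is applied
  return file;
}

void main() {
  final dir = Directory.systemTemp.createTempSync('uploads_');
  final saved = saveUploadRaw(dir, 'data.json', [0x7B, 0x7D]); // bytes of "{}"
  print('stored ${saved.lengthSync()} bytes at ${saved.path}');
  dir.deleteSync(recursive: true);
}
```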

Another solution would be to simply return List<int> for all uploaded files. I am not sure how much this would affect other kinds of applications. However, when a .json file is uploaded from an HTML form, IE sends it as “Content-Type: text/plain” and Chrome, Firefox and Safari send it as “Content-Type: application/octet-stream” (I don’t know why). In any case, .json files uploaded through an HTML form will never be parsed.

Regarding filenames with multibyte characters: although Windows uses UTF-8, I think it might be safe to keep LATIN1 decoding. I am not familiar with other file systems. On Windows we can recover the name using UTF8.decode(LATIN1.encode(part.filename)), assuming that LATIN1.decode(bytes) simply produces a String in which each character has the same value as the corresponding byte of bytes.
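
The round trip mentioned above can be sketched as follows. This is only an illustration: it uses the current `dart:convert` constants `utf8` and `latin1` (the thread uses the older `UTF8` / `LATIN1` names), and the function name `recoverUtf8Filename` and its fallback behaviour are assumptions for this example.

```dart
import 'dart:convert';

/// Hypothetical helper: re-reads a Latin-1 decoded filename as UTF-8.
/// Returns the input unchanged when it cannot be reinterpreted.
String recoverUtf8Filename(String latin1Decoded) {
  try {
    // latin1.encode maps each code unit back to the byte with the same value.
    return utf8.decode(latin1.encode(latin1Decoded));
  } on ArgumentError {
    // The string contains code units above 0xFF, so it was not Latin-1 decoded.
    return latin1Decoded;
  } on FormatException {
    // The bytes are not valid UTF-8; keep the Latin-1 reading.
    return latin1Decoded;
  }
}

void main() {
  // Simulate a UTF-8 filename that a server decoded as Latin-1.
  final wireBytes = utf8.encode('日本語.txt');
  final asLatin1 = latin1.decode(wireBytes);
  print(recoverUtf8Filename(asLatin1));
}
```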

DartBot · 2015-06-05T22:32:24Z

Comment by madsager

cc @Skabet.
Added Area-IO, Triaged labels.

DartBot · 2015-06-05T22:32:24Z

Comment by skabet

Set owner to @Skabet.
Removed Area-IO label.
Added Area-Pkg, Library-HttpServer, Accepted labels.

DartBot · 2015-06-05T22:32:24Z

Comment by skabet

Hi Terry,

You are right, in most cases with file uploads, it's the raw binary one wants to access. What if we do the following:

Always provide the raw List<int> data.
Add a method to the FileUpload class: 'parsedData()', that will try to parse/decode the data depending on the mime type. We can even throw in an optional 'mimeType' argument for it, so one can override the default mime type, e.g. parse as 'text/utf-8' instead of 'application/json'.
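
A minimal sketch of that idea, assuming a hypothetical class named `FileUploadSketch`: the raw bytes are always available, and `parsedData()` decodes on request with an optional override. The class, its fields, and the small set of handled mime types are illustrative assumptions, not the package's API.

```dart
import 'dart:convert';

/// Hypothetical upload wrapper: raw bytes first, parsing on demand.
class FileUploadSketch {
  final List<int> data; // always the unmodified bytes
  final String defaultMimeType; // taken from the part's Content-Type

  FileUploadSketch(this.data, this.defaultMimeType);

  /// Decodes [data] according to [mimeType], or [defaultMimeType] if omitted.
  /// Unknown types fall through to the raw bytes.
  Object? parsedData({String? mimeType}) {
    final type = mimeType ?? defaultMimeType;
    if (type == 'application/json') return jsonDecode(utf8.decode(data));
    if (type.startsWith('text/')) return utf8.decode(data);
    return data;
  }
}

void main() {
  final upload = FileUploadSketch(utf8.encode('{"a":1}'), 'application/json');
  print(upload.parsedData()); // parsed as JSON
  print(upload.parsedData(mimeType: 'text/plain')); // same bytes, read as text
}
```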

Regarding the filename, I think we should do a test and see what the different browsers upload. If we can hit a 90% success rate with some default encoding, that could be the way to go.

DartBot · 2015-06-05T22:32:25Z

Comment by skabet

I just tried with both Chrome and IE on Windows, and I get the following:

With <meta charset="UTF-8" />:
- Chrome: as utf8
- IE: as utf8

Without <meta charset="UTF-8" />:
- Chrome: multi-bytes replaced with ?
- IE: as utf8

I think it's fine to use utf8-decoding for filenames.

DartBot · 2015-06-05T22:32:25Z

This comment was originally written by [email protected]

I confirmed it on my Windows Vista using following HTML text:

001 <!DOCTYPE html>
002 <html>
003 <head>
004 <title>file_upload_test</title>
005 <meta http-equiv="content-type" content="text/html; charset=UTF-8">
006 </head>
007 <body>
008 <form action="http://localhost:8080/DumpHttpMultipart"
009 enctype="multipart/form-data"
010 accept-charset="UTF-8"
011 method="POST"> <br>
012 What is your name? <input type="text" name="submitter"> <br>
013 What files are you sending? <input type="file" name="content"> <br>
014 <input type="submit" value="Send File">
015 </form>
016 </body>
017 </html>

If line 005 or 010 is present, Chrome, Firefox and Safari send filenames with multi-byte characters as UTF-8. Otherwise, such filenames are transmitted as Shift_JIS characters (one of the most popular Japanese character encodings). Regardless of whether line 005 or 010 is present, IE sends them as UTF-8.

I agree with using UTF-8 decoding for filenames (the current implementation uses ISO-8859-1 decoding). It’s common to include line 005 in such applications.
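
One way to picture UTF-8 decoding with a fallback for the non-UTF-8 case (for example the Shift_JIS bytes seen when neither line 005 nor 010 is present) is the sketch below. The function name `decodeFilename` and the choice of Latin-1 as the fallback are assumptions for illustration only.

```dart
import 'dart:convert';

/// Hypothetical helper: decodes raw filename bytes as UTF-8 and falls back
/// to Latin-1 when the bytes are not valid UTF-8.
String decodeFilename(List<int> rawBytes) {
  try {
    return utf8.decode(rawBytes); // throws FormatException on malformed input
  } on FormatException {
    // Latin-1 accepts every byte value, so no information is lost and the
    // caller can still re-encode the string to get the original bytes.
    return latin1.decode(rawBytes);
  }
}

void main() {
  print(decodeFilename(utf8.encode('日本語.txt')));
  // Shift_JIS bytes for "日本" are not valid UTF-8, so the fallback applies.
  final fallback = decodeFilename([0x93, 0xFA, 0x96, 0x7B]);
  print(fallback.length);
}
```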

DartBot · 2015-06-05T22:32:25Z

Comment by skabet

Hi

What do you think about the following API?

/**
* An HTTP content body produced by [HttpBodyHandler] for either [HttpRequest]
* or [HttpClientResponse].
*/
abstract class HttpBody {
  /**
   * The actual data of the request.
   */
  List<int> get data;

  /**
   * Convert the data using mimeType.
   *
   * If mimeType is left unspecified, the Content-Type header will be used.
   */
  dynamic asMimeType({String mimeType});

  /**
   * Parse the [data] as text.
   *
   * If the headers contain a charset hint, that charset will be used.
   */
  String asText();

  /**
   * Parse the [data] as JSON.
   */
  dynamic asJSON();

  /**
   * Parse the data as either multipart/form-data or
   * application/x-www-form-urlencoded.
   *
   * The Content-Type header will be used to identify the parsing.
   */
  Map asFormPost();
}

/**
* The [HttpBody] of a [HttpClientResponse] will be of type
* [HttpClientResponseBody].
*/
abstract class HttpClientResponseBody implements HttpBody, HttpClientResponse {
}

/**
* The [HttpBody] of a [HttpRequest] will be of type [HttpRequestBody].
*/
abstract class HttpRequestBody implements HttpBody, HttpRequest {
}

/**
* A [HttpBodyFileUpload] object wraps a file upload, presenting a way for
* extracting filename, contentType and the data of the uploaded file.
*/
abstract class HttpBodyFileUpload {
  /**
   * The filename of the uploaded file.
   */
  String get filename;

  /**
   * The [ContentType] of the uploaded file.
   */
  ContentType get contentType;

  /**
   * The content of the file.
   */
  List<int> get content;
}

cc @sethladd.
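
To make the accessor idea in the proposal concrete, here is a small stand-alone sketch of `data`, `asText()` and `asJSON()` over raw bytes. The class name `RawBodySketch` and its `charset` constructor parameter are assumptions for this example (in the proposal the charset would come from the headers), and defaulting to UTF-8 is also an assumption.

```dart
import 'dart:convert';

/// Hypothetical stand-in for the proposed body accessors.
class RawBodySketch {
  final List<int> data; // the undecoded body bytes
  final String? charset; // stands in for the Content-Type charset hint

  RawBodySketch(this.data, {this.charset});

  /// Decodes [data] with the hinted charset, or UTF-8 when the hint is
  /// missing or unknown to dart:convert.
  String asText() => (Encoding.getByName(charset) ?? utf8).decode(data);

  /// Parses the text form of [data] as JSON.
  dynamic asJSON() => jsonDecode(asText());
}

void main() {
  final latin = RawBodySketch([0xE9], charset: 'iso-8859-1');
  print(latin.asText());

  final json = RawBodySketch(utf8.encode('[1,2]'));
  print(json.asJSON());
}
```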

DartBot · 2015-06-05T22:32:25Z

Comment by sethladd

Thanks! I like how HttpRequestBody implements HttpRequest now. Also, I like how I can control how I get the body (json, text, etc) because sometimes a content-type is not set on the request.

DartBot · 2015-06-05T22:32:25Z

This comment was originally written by [email protected]

I think this will give us more flexible POST body data handling.

DartBot · 2015-06-05T22:32:26Z

Comment by anders-sandholm

Removed Library-HttpServer label.
Added Pkg-HttpServer label.

DartBot assigned andersjohnsen Jun 5, 2015

DartBot added bug Accepted labels Jun 5, 2015

DartBot mentioned this issue Jun 5, 2015

http_server uploaded file handling dart-lang/sdk#14303

Closed
