upas/fs: fix handling of multi-line header fields (thanks theinicke)

Right now, upasfs exposes header lines as is, without stripping
out newlines, even though it documents that the info file holds
one header per line.

As a result, when we get a mail with headers that span lines,
a single header shows up as several lines in the info file and
our tools get confused.

These split lines are not semantically meaningful. From RFC5322:

  2.2.3.  Long Header Fields

   Each header field is logically a single line of characters comprising
   the field name, the colon, and the field body.  For convenience
   however, and to deal with the 998/78 character limitations per line,
   the field body portion of a header field can be split into a
   multiple-line representation; this is called "folding".  The general
   rule is that wherever this specification allows for folding white
   space (not simply WSP characters), a CRLF may be inserted before any
   WSP.

As a result, to simplify processing, we should just strip out the
line separators when exposing the headers from upasfs.
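
A minimal, standalone sketch of the effect, assuming a hosted C
compiler rather than the Plan 9 toolchain. unfold() follows the logic
added by this commit; demo() and the sample field body are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same logic as the unfold() in this commit: drop every CR and LF in place. */
static char*
unfold(char *s)
{
	char *p, *q;

	q = s;
	for(p = q; *p != '\0'; p++)
		if(*p != '\r' && *p != '\n')
			*q++ = *p;
	*q = '\0';
	return s;
}

/* Hypothetical demo helper, not part of the commit. */
static void
demo(void)
{
	/* Made-up folded field body: CRLF followed by a space. */
	char body[] = "a long subject that\r\n continues on a second line";

	unfold(body);
	/* The CRLF is gone; the whitespace that followed it is kept. */
	assert(strcmp(body, "a long subject that continues on a second line") == 0);
	puts(body);
}
```

Only the line separators are removed. The whitespace after each
fold stays in place, so words on either side of a fold do not run
together.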
Ori Bernstein 2020-09-17 16:11:10 -07:00
parent 77b819d585
commit e3166b4fe8


@@ -732,6 +732,19 @@ rtrim(char *p)
 	return p;
 }
+static char*
+unfold(char *s)
+{
+	char *p, *q;
+	q = s;
+	for(p = q; *p != '\0'; p++)
+		if(*p != '\r' && *p != '\n')
+			*q++ = *p;
+	*q = '\0';
+	return s;
+}
 static char*
 addr822(char *p, char **ac)
 {
@@ -760,7 +773,7 @@ addr822(char *p, char **ac)
 for(p++; c = *p; p++){
 	if(ac && c == '"')
 		break;
-	if(!addrdone && !incomment)
+	if(!addrdone && !incomment && c != '\r' && c != '\n')
 		ps = sputc(ps, e, c);
 	if(!quoted && *p == '"')
 		break;
@@ -883,7 +896,7 @@ replace822(Message *, Header *h, char*, char *p)
 static char*
 copy822(Message*, Header *h, char*, char *p)
 {
-	return rtrim(strdup(skipwhite(p + h->len)));
+	return rtrim(unfold(strdup(skipwhite(p + h->len))));
 }
 static char*