Tuesday, December 18, 2007

Back to the binaries! Yeah!

After all this XML work, the binary file formats feel like a different world. For the fields work I needed to analyze the “form field” structure of the binary .DOC format:

The header: Actually a misused PICT structure:

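| b10 | b16 | Field | Type | Size | Bitfield | Comments |
|-----|-----|-------|------|------|----------|----------|
| 0 | 0 | lcb | U32 | | | Count of bytes of the whole block. |
| 4 | 4 | cbHeader | U16 | | | Always 0x44. |
| 6 | 6 | | U8[62] | | | Contains zeros. This is in fact the PICT struct, but since it's not needed we can fill it with zeros. |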
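As a rough illustration of the header layout, here is a minimal Python sketch that reads and writes it. The function names `parse_ff_header` and `build_ff_block` are made up for this example, little-endian byte order is assumed, and `lcb` is taken to include the header itself (that is how I read "whole block").

```python
import struct

HEADER_SIZE = 0x44  # cbHeader: 4 (lcb) + 2 (cbHeader) + 62 (unused PICT bytes)


def parse_ff_header(data: bytes) -> dict:
    """Split a form-field block into its header fields and the payload.

    Hypothetical helper; assumes little-endian values and that lcb
    counts the header as well as the payload.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("block is shorter than the 0x44-byte header")
    # "<IH" = little-endian U32 (lcb) followed by U16 (cbHeader)
    lcb, cb_header = struct.unpack_from("<IH", data, 0)
    if cb_header != HEADER_SIZE:
        raise ValueError(f"unexpected cbHeader 0x{cb_header:X}")
    return {
        "lcb": lcb,
        "cbHeader": cb_header,
        "pict": data[6:cb_header],       # the 62 filler bytes
        "payload": data[cb_header:lcb],  # form field payload
    }


def build_ff_block(payload: bytes) -> bytes:
    """Prepend a zero-filled header to an already encoded payload."""
    lcb = HEADER_SIZE + len(payload)
    return struct.pack("<IH", lcb, HEADER_SIZE) + bytes(62) + payload
```

Calling `parse_ff_header(build_ff_block(payload))["payload"]` should give back the original payload bytes.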

The formfield payload (Unicode Variant)

| b10 | b16 | Field | Type | Size | Bitfield | Comments |
|-----|-----|-------|------|------|----------|----------|
| 0 | 0 | cUnicodeMarker | U8[4] | | | Contains {0xFF,0xFF,0xFF,0xFF}. |
| 4 | 4 | fftype | U8 | :2 | 03 | Type: 0 = Text, 1 = Check Box, 2 = List. |
| | | ffres | U8 | :5 | 7C | Result field for a form field. Values from 0 to N-1, where N is the number of \ffl entries. In case of check boxes: 0 = unchecked, 1 = checked. |
| | | ffownhelp | U8 | :1 | 80 | 1 if there is associated Help text, 0 otherwise. |
| 5 | 5 | ffownstat | U8 | :1 | 01 | 1 if there is associated status line text, 0 otherwise. |
| | | ffprot | U8 | :1 | 02 | 1 if this field is protected, 0 otherwise. |
| | | ffsize | U8 | :1 | 04 | Type of size selected for check box field: 0 = Auto, 1 = Exact. |
| | | fftypetxt | U8 | :3 | 38 | Type of text field: 0 = Regular text, 1 = Number, 2 = Date, 3 = Current date, 4 = Current time, 5 = Calculation. |
| | | ffrecalc | U8 | :1 | 40 | 1 if the field should be calculated on exit, 0 otherwise. |
| | | ffhaslistbox | U8 | :1 | 80 | 1 if this field has a list box attached to it, 0 otherwise. |
| 6 | 6 | ffmaxlen | U16 | :15 | 7FFF | Number of characters for text field. Zero means unlimited. |
| | | | U16 | :1 | 8000 | Unknown. Set to zero. |
| 8 | 8 | ffhps | U16 | | | Check box size (in half-points). |
| 10 | A | xstz_ffname | Xstz_UString0 | | | Form field name. |
| | | xstz_ffddeftext | Xstz_UString0 | | | Default text for field. Only if type==0. |
| | | ffdefres | U16 | | | Default resource for list field. Default value for check box (0 = default unchecked, 1 = default checked). Only if type!=0. |
| | | xstz_ffformat | Xstz_UString0 | | | Format for text field. |
| | | xstz_ffhelptext | Xstz_UString0 | | | Help text. |
| | | xstz_ffstattext | Xstz_UString0 | | | Status line text. |
| | | xstz_ffentrymcr | Xstz_UString0 | | | Macro to execute upon entry into this form field. |
| | | xstz_ffexitmcr | Xstz_UString0 | | | Macro to execute upon exit from this form field. |
| | | cUnicodeMarker2 | U8[2] | | | Contains {0xFF, 0xFF}. Padding and/or indicator for Unicode? |
| | | fflLen | U32 | | | Number of ffl entries. |
| | | ffl | Xstz_UString[fflLen] | | | Resource strings for lists. |
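The fixed-size part of the payload (offsets 0 to 9) is mostly bit twiddling, so a small sketch may help. This is only an illustration in Python: `parse_ff_fixed` is a name I made up, the masks are the ones from the bitfield column, and little-endian byte order is an assumption.

```python
import struct

FFTYPE_NAMES = {0: "Text", 1: "Check Box", 2: "List"}


def parse_ff_fixed(payload: bytes) -> dict:
    """Decode offsets 0..9 of the Unicode variant of the payload.

    Hypothetical helper; the variable-length strings that follow
    at offset 10 are not handled here.
    """
    if payload[:4] != b"\xff\xff\xff\xff":
        raise ValueError("Unicode marker not found")
    # Two flag bytes, then two little-endian U16 values (ffmaxlen word, ffhps)
    b4, b5, maxlen_raw, hps = struct.unpack_from("<BBHH", payload, 4)
    return {
        "fftype": b4 & 0x03,
        "ffres": (b4 & 0x7C) >> 2,
        "ffownhelp": (b4 & 0x80) >> 7,
        "ffownstat": b5 & 0x01,
        "ffprot": (b5 & 0x02) >> 1,
        "ffsize": (b5 & 0x04) >> 2,
        "fftypetxt": (b5 & 0x38) >> 3,
        "ffrecalc": (b5 & 0x40) >> 6,
        "ffhaslistbox": (b5 & 0x80) >> 7,
        "ffmaxlen": maxlen_raw & 0x7FFF,  # top bit is the unknown flag
        "ffhps": hps,
    }
```

For display, `FFTYPE_NAMES.get(fields["fftype"], "unknown")` turns the numeric type into the names from the table.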

An Xstz_UString has the following form:

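| b10 | b16 | Field | Type | Size | Bitfield | Comments |
|-----|-----|-------|------|------|----------|----------|
| 0 | 0 | len | U16 | | | Length of the string. |
| 2 | 2 | Unicode char | U16[len] | | | Unicode chars. |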

An Xstz_UString0 has the following form:

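| b10 | b16 | Field | Type | Size | Bitfield | Comments |
|-----|-----|-------|------|------|----------|----------|
| 0 | 0 | len | U16 | | | Length of the string. |
| 2 | 2 | Unicode char | U16[len] | | | Unicode chars. |
| 2+2*len | | Zero | U16 | | | Trailing “0”. |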
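The two string layouts above could be read roughly like this. It is a sketch under a few assumptions: the function names are invented, `len` is taken to count U16 characters without the trailing zero, and the characters are decoded as little-endian UTF-16.

```python
import struct


def read_xstz_ustring(data: bytes, offset: int = 0) -> tuple:
    """Read a length-prefixed string; returns (text, next_offset)."""
    (length,) = struct.unpack_from("<H", data, offset)
    start = offset + 2
    end = start + 2 * length  # each character is one U16
    if end > len(data):
        raise ValueError("string runs past the end of the buffer")
    return data[start:end].decode("utf-16-le"), end


def read_xstz_ustring0(data: bytes, offset: int = 0) -> tuple:
    """Same as above, but also consumes the trailing U16 zero."""
    text, end = read_xstz_ustring(data, offset)
    (terminator,) = struct.unpack_from("<H", data, end)
    if terminator != 0:
        raise ValueError("missing trailing zero")
    return text, end + 2
```

Returning the next offset makes it easy to walk the run of strings that starts at offset 10 of the payload, feeding each returned offset into the next call.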

In the non-Unicode encoding, the Unicode marker disappears and the string characters are U8 instead of U16.

You might also want to take a look at the ffData element in OOXML ;-)