Read Excel file without signature in PHP


Question: How can I read or modify an Excel file without a signature so that PHP can parse it properly?

For a project, I want to automatically download and read an Excel file from the national volleyball association (Nevobo) using PHP. Downloading goes fine; reading does not. The issue seems to be related to the fact that there is no signature in the first 8 bytes to tell PHPExcel that it is an OLE file, so PHPExcel identifies it as a CSV file, which it is not. Excel can open the file, but forces me to save it in a different format.

I have downloaded other files from the same source (with different content, though) that also lack the signature. On these files I have managed to filter out the control characters (\x00 thru \xff) in PHP and automatically create a new row when it sees a date (since that is in column A), but unfortunately that didn't work for this file.
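To confirm that the signature really is missing, the first 8 bytes can be compared against the OLE2 compound-file signature (`D0 CF 11 E0 A1 B1 1A E1`). The following is a minimal sketch; the helper names `hasOleSignature` and `hexPrefix` are hypothetical and not part of PHPExcel.

```php
<?php
// OLE2 compound-file signature found at the start of binary .xls files.
const OLE_SIGNATURE = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

/**
 * Hypothetical helper: true when $bytes starts with the OLE2 signature.
 */
function hasOleSignature(string $bytes): bool
{
    // strncmp is binary safe; a string shorter than 8 bytes never matches.
    return strncmp($bytes, OLE_SIGNATURE, 8) === 0;
}

/**
 * Hypothetical helper: upper-case hex dump of the first $count bytes.
 */
function hexPrefix(string $bytes, int $count = 8): string
{
    return strtoupper(bin2hex(substr($bytes, 0, $count)));
}
```

Passing the downloaded file's contents (for example from `file_get_contents`) to `hasOleSignature` tells you whether an OLE reader has any chance of recognising it, and `hexPrefix` shows what is there instead.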

function cleanpart($part)
{
    // Strip control and high bytes, keeping \x02, \x0b and \x0c as markers.
    $part = trim(preg_replace('/[\x00\x01\x03-\x0a\x0d-\x1f\x80-\xff]/', '', trim($part, ' ')), ' ');
    // Normalise \x0b and double quotes to \x0c, then collapse repeats.
    $part = preg_replace('/\x0b/', "\x0c", $part);
    $part = preg_replace('/\"/', "\x0c", $part);
    $part = preg_replace('/\x0c+/', "\x0c", $part);
    $part = preg_replace('/\x0c\x02/', "\x0c", $part);
    if ($part == "\x02\x0c" || $part == "\x02\x0b") {
        return false;
    }
    // Turn the remaining control bytes into a single \x02, then drop them.
    $part = trim(preg_replace('/[\x00-\x1f\x80-\xff]/', "\x02", $part), ' ');
    $part = trim(preg_replace('/\x02+/', "\x02", $part), ' ');
    $part = trim(preg_replace('/[\x00\x01\x03-\x1f\x80-\xff]/', '', $part), ' ');
    if (strlen($part) == 0) {
        return false;
    }
    $part = trim(preg_replace('/\x02/', "", $part), ' ');

    return $part;
}

foreach (explode("\x04", preg_replace('!\x04+!', "\x04", $data)) as $part) {
    if (!($part = cleanpart($part))) {
        continue;
    }

    // create array
}

LibreOffice reads the file as an Excel file, so it must be a format known to LibreOffice, even if file magic identifies it as Apple Basic (!) and other utilities identify it as Targa (which means little more than "binary data whose length is a multiple of three").

However, this is a delimited text format. Possibly it is a word processor format, and the strange characters are control characters for tabulation and typefacing?

To convert it more reliably into a CSV-like form, you can replace the control sequences with tabulations, skipping the first 12 characters. The control sequences appear to be 12 bytes long, prefixed by \x04 \x02, so:

$clean = preg_replace('#\\x04\\x02..........#ms', "\t", substr($dirty, 24)); 

(I have skipped the first control sequence too, giving a 12+12 = 24 byte skip).
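The replacement above can be tried on synthetic input before running it on the real file. This is only a sketch: `controlSequencesToTabs` is a hypothetical wrapper, the sample bytes are made up, and `.{10}` is used as shorthand for the ten dots in the original pattern.

```php
<?php
/**
 * Hypothetical wrapper: skip the 24 leading bytes, then replace every
 * 12-byte control sequence (\x04 \x02 plus 10 arbitrary bytes) with a tab.
 */
function controlSequencesToTabs(string $dirty): string
{
    // The "s" modifier lets "." match \r and \n inside a control sequence.
    $clean = preg_replace('#\\x04\\x02.{10}#s', "\t", substr($dirty, 24));
    if ($clean === null) {
        throw new RuntimeException('preg_replace failed');
    }
    return $clean;
}

// Made-up sample: 12 header bytes, then fields separated by control sequences.
$sequence = "\x04\x02" . "\x00\x01\r\n" . str_repeat("\x00", 6);
$dirty    = str_repeat("\x00", 12) . $sequence . "datum" . $sequence . "tijd";

echo json_encode(controlSequencesToTabs($dirty)), "\n"; // prints "datum\ttijd"
```

The sample sequence deliberately contains a carriage return and a line feed, since those are the bytes that would break a pattern without the `s` modifier.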

You can then split the fields into chunks, or PHP's CSV parse functions should be able to work, with 20 fields per row.

I cannot use CSV parsing with the sequences as the delimiter, because the sequences are different throughout the file. They also include carriage returns, which forces the use of the whitespace/line modifiers in the regex.

This parser appears to work:
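One way to do the chunking, assuming the data has already been converted to a tab-delimited string, is `explode` followed by `array_chunk`. The helper name `tabsToRows` is hypothetical; the default of 20 columns is the row width observed in this particular file.

```php
<?php
/**
 * Hypothetical helper: split a tab-delimited string into rows of a fixed
 * number of columns. The last row may be shorter if the data is truncated.
 */
function tabsToRows(string $clean, int $columns = 20): array
{
    $fields = explode("\t", $clean);
    return array_chunk($fields, $columns);
}
```

This only works because the fields themselves are assumed not to contain tab characters; if they could, a different placeholder would be needed.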

<?php
$clean = preg_split(
    '#\\x04\\x02..........#ms',
    substr(file_get_contents('excelgen.xls'), 24)
);
$rows = array();
while (!empty($clean)) {
    $rows[] = array_splice($clean, 0, 20);
}
// $header = array_shift($rows);
print_r($rows);

This yields:

array
(
    [0] => array
        (
            [0] => datum
            [1] => tijd
            [2] => team thuis
            [3] => team uit
            [4] => locatie
            [5] => veld
            [6] => regio
            [7] => poule
            [8] => code
            [9] => zaal code
            [10] => zaal
            [11] => plaats
            [12] => eerste scheidsrechter
            [13] => tweede scheidsrechter
            [14] => rapporteur / begeleider / jurylid
            [15] => lijnrechter 1
            [16] => lijnrechter 2
            [17] => lijnrechter 3
            [18] => lijnrechter 4
            [19] => reserve
...
...
    [54] => array
        (
            [0] => 2016-04-23
            [1] => 19:30
            [2] => ecare apollo 8 hs 1
            [3] => lycurgus hs 2
            [4] => de veste, borne
            [5] => 1
            [6] => nationaal
            [7] => 1ah
            [8] => al
            [9] => bneve
            [10] => de veste
            [11] => borne
            ...
        )
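Since the first row holds the column names, the commented-out `array_shift` line in the parser suggests a natural next step: keying each row by its header. A minimal sketch, with `rowsToRecords` as a hypothetical helper name:

```php
<?php
/**
 * Hypothetical helper: treat the first row as the header and return the
 * remaining rows as associative arrays keyed by column name.
 */
function rowsToRecords(array $rows): array
{
    $header = array_shift($rows);
    if ($header === null) {
        return []; // no rows at all
    }

    $records = [];
    foreach ($rows as $row) {
        // Skip rows whose width differs from the header (e.g. a short last row).
        if (count($row) !== count($header)) {
            continue;
        }
        $records[] = array_combine($header, $row);
    }
    return $records;
}
```

With the output shown above, a record could then be read as `$record['datum']` or `$record['team thuis']` instead of by numeric index.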
