In a first article, we saw how to transform a PHP array and export it to a CSV file. This article does the opposite: it shows how to generate a PHP array from a CSV file.

We will transform a CSV file containing this kind of data:

first name   last name    phone               email
Bastien      Malahieude   06 XX XX XX XX XX   contact@bastienmalahieude.fr
John         Doe          06 XX XX XX XX XX   john@doe.fr

and return a PHP array like this one:

array(2) {
  [0]=>
  array(4) {
    ["first name"]=>
    string(7) "Bastien"
    ["last name"]=>
    string(10) "Malahieude"
    ["phone"]=>
    string(17) "06 XX XX XX XX XX"
    ["email"]=>
    string(28) "contact@bastienmalahieude.fr"
  }
  [1]=>
  array(4) {
    ["first name"]=>
    string(4) "John"
    ["last name"]=>
    string(3) "Doe"
    ["phone"]=>
    string(17) "06 XX XX XX XX XX"
    ["email"]=>
    string(11) "john@doe.fr"
  }
}

To do that, we will create an import_csv_to_array function that reads the data from a file and returns a PHP array. The function takes two parameters, of which only the first, $file, is mandatory; $enclosure defaults to a double quote:

function import_csv_to_array($file,$enclosure = '"')
{
// @TODO Do something here
}

First, we retrieve the contents of the file with the file_get_contents function.

$csv_string = file_get_contents($file);

This content is stored as a string in the $csv_string variable.
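file_get_contents returns false when the file cannot be read, so it can be worth checking the result before going further. The sketch below is one possible way to do it; the helper name read_csv_file and the choice of throwing a RuntimeException are assumptions made for illustration, not part of the function built in this article.

```php
<?php
// Hypothetical helper (not part of the original code of this article)
function read_csv_file(string $file): string
{
    // Checking first avoids the warning file_get_contents() emits on a missing file
    if (!is_file($file) || !is_readable($file)) {
        throw new RuntimeException("Cannot read CSV file: $file");
    }

    $csv_string = file_get_contents($file);

    // file_get_contents() returns false on failure
    if ($csv_string === false) {
        throw new RuntimeException("Cannot read CSV file: $file");
    }

    return $csv_string;
}
```

With such a helper, the first line of import_csv_to_array could become $csv_string = read_csv_file($file);.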

Guess the CSV file delimiter

Most CSV import programs ask the user to choose which delimiter their file uses. Indeed, the data in a CSV file can be delimited by several different characters. The most common are the semicolon (;), the comma (,) and the tab. In rare cases, the data is delimited by a pipe (|).

Provided the CSV file is large enough, the delimiter is likely to be whichever of these four characters appears most often in the file. We will use this assumption to guess the separator used in the CSV file, and thus make the function compatible with as many files as possible. Keep in mind that this is only a heuristic: a comma-delimited file whose text fields contain many semicolons, for instance, could be misdetected.

To do that, we first create an array of the possible delimiters. Depending on your project, you can add or remove delimiters from this array:

// List of delimiters that we will check for
$delimiters = array(';' => 0,',' => 0,"\t" => 0,"|" => 0);

Note: If you already know which delimiter your files use, this step is unnecessary.

Then, using the substr_count function, we count the number of occurrences of each of these delimiters within the file content.

foreach ($delimiters as $delimiter => &$count) {
    $count = substr_count($csv_string, $delimiter);
}
// Break the reference so that later writes to $count do not modify the array
unset($count);

Using &$count here lets you write the result directly into the $delimiters array. You can also assign to $delimiters[$delimiter] instead. More information is available in the PHP manual's section on references.

After this loop, $delimiters contains the number of occurrences of each candidate delimiter within the file content.
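As a small illustration of the $delimiters[$delimiter] alternative, the loop can write through the array key instead of a reference. This is only a sketch on a made-up sample string; it has the side benefit of not leaving a reference behind in $count once the loop is over.

```php
<?php
// Made-up sample content, for illustration only
$csv_string = "first name;last name\nJohn;Doe\n";

$delimiters = array(';' => 0, ',' => 0, "\t" => 0, '|' => 0);

// Same counting as with &$count, but writing through the key instead of a reference
foreach (array_keys($delimiters) as $delimiter) {
    $delimiters[$delimiter] = substr_count($csv_string, $delimiter);
}

var_dump($delimiters[';']); // int(2)
```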

The last step is to retrieve the array key whose $count value is the largest. Here we use the array_search and max functions:

$delimiter = array_search(max($delimiters), $delimiters);
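To see how this behaves, here is a worked example with hypothetical counts. Note that array_search returns the first matching key, so on a tie (including a file that contains none of the candidates) the earliest entry of $delimiters wins.

```php
<?php
// Hypothetical counts, as if produced by the counting loop
$delimiters = array(';' => 6, ',' => 1, "\t" => 0, '|' => 0);

$delimiter = array_search(max($delimiters), $delimiters);
var_dump($delimiter); // string(1) ";"

// Tie: no candidate was found in the file, so the first key is returned
$no_match = array(';' => 0, ',' => 0, "\t" => 0, '|' => 0);
var_dump(array_search(max($no_match), $no_match)); // string(1) ";"
```

If your files are more often comma-delimited, putting ',' first in the array would make it the fallback instead.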

Extracting the data from the CSV file

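The second step of our function is to extract all the data from the file and add it to our PHP array.

To do that, we first split the CSV content into a PHP array of lines using the explode function. We assume here that the line separator is a newline ("\n"). Note that files with Windows line endings ("\r\n") will keep a trailing "\r" on each line, and that a quoted field containing a line break would be split incorrectly by this approach.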

$lines = explode("\n", $csv_string);
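One possible way to accept Windows ("\r\n") and old Mac ("\r") line endings as well is to split with a regular expression instead of explode. The sketch below uses a made-up string; like explode, it still does not cope with quoted fields that contain line breaks.

```php
<?php
// Made-up sample mixing Windows (\r\n) and Unix (\n) line endings
$csv_string = "name;email\r\nJohn;john@doe.fr\nJane;jane@doe.fr";

// Split on \r\n, \r or \n
$lines = preg_split('/\r\n|\r|\n/', $csv_string);

var_dump(count($lines)); // int(3)
var_dump($lines[0]);     // string(10) "name;email"
```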

Our CSV file contains headers, which are on the first line. The array_shift function removes the first element of an array and returns it.

We then pass that line to the str_getcsv function, which splits a CSV line into a PHP array.

$head = str_getcsv(array_shift($lines),$delimiter,$enclosure);
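As a quick illustration of what str_getcsv returns, here is a sketch on a made-up header line. The escape character is passed explicitly here, which is a choice of this example (the function in this article relies on the default), because recent PHP versions discourage relying on its default value.

```php
<?php
// Made-up header line; the second field is enclosed in double quotes
$line = 'first name;"last name";phone;email';

// Arguments: line, delimiter, enclosure, escape
$head = str_getcsv($line, ';', '"', '\\');

// $head now holds four strings: first name, last name, phone, email
var_dump($head);
```

The enclosure is stripped from the quoted field, which is why $enclosure has to match the one used in the file.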

Once the headers are retrieved, we create an array to store our output data. Then we can loop over all the remaining lines with a foreach loop:

$array = array();
foreach ($lines as $line) {
// Handle each line here
}

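Some files end with an empty line. To handle that, we check that the line actually contains data:

// trim() also catches a line holding only "\r"; unlike empty(), it keeps a line containing just "0"
if (trim($line) === '') {
    continue;
}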

If there is data, we parse the contents of the line in the same way as for the headers above:

$csv = str_getcsv($line,$delimiter,$enclosure);

Finally, array_combine lets us use the headers as keys for the values of each line we have parsed. We then append the result to our main array, $array:

$array[] = array_combine( $head, $csv );
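array_combine needs both arrays to have the same number of elements; otherwise it throws a ValueError in PHP 8 (and returned false with a warning in earlier versions). Here is a minimal sketch of one way to guard against malformed rows, on made-up data. Skipping such rows is a design choice of this example, not something the function in this article does.

```php
<?php
// Made-up data: the second row is missing a column
$head = array('first name', 'last name', 'email');
$rows = array(
    array('John', 'Doe', 'john@doe.fr'),
    array('Jane', 'Doe'),
);

$array = array();
foreach ($rows as $csv) {
    // Skip rows whose column count does not match the headers
    if (count($csv) !== count($head)) {
        continue;
    }
    $array[] = array_combine($head, $csv);
}

var_dump(count($array)); // int(1)
```

Depending on your project, you may prefer to pad short rows or to report them instead of silently skipping them.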

Finally, we return everything:

return $array;

Complete PHP function

To make the code easier to read, I moved the part that guesses the separator of the CSV file into a second, dedicated detect_delimiter function.

The final code we have just created is the following:

/**
 *
 * This function imports a CSV file and converts it into a PHP array
 *
 * @param string $file      The file you want to import the data from
 * @param string $enclosure The type of enclosure used in the CSV file
 *
 * @return array            The array containing the CSV infos
 */
function import_csv_to_array($file,$enclosure = '"')
{

    // Let's get the content of the file and store it in the string
    $csv_string = file_get_contents($file);

    // Let's detect what is the delimiter of the CSV file
    $delimiter = detect_delimiter($csv_string);

    // Get all the lines of the CSV string
    $lines = explode("\n", $csv_string);

    // The first line of the CSV file is the headers that we will use as the keys
    $head = str_getcsv(array_shift($lines),$delimiter,$enclosure);

    $array = array();

    // For all the lines within the CSV
    foreach ($lines as $line) {

        // Sometimes CSV files have an empty line at the end; we skip it.
        // trim() is used instead of empty() so that a line containing only "0" is kept
        if (trim($line) === '') {
            continue;
        }

        // Get the CSV data of the line
        $csv = str_getcsv($line,$delimiter,$enclosure);

        // Combine the header and the lines data
        $array[] = array_combine( $head, $csv );

    }

    // Returning the array
    return $array;
}

/**
 *
 * This function detects the delimiter inside the CSV file.
 *
 * It allows the function to work with different types of delimiters: ";", ",", "\t" or "|"
 *
 * @param string $csv_string    The content of the CSV file
 * @return string               The delimiter used in the CSV file
 */
function detect_delimiter($csv_string)
{

    // List of delimiters that we will check for
    $delimiters = array(';' => 0,',' => 0,"\t" => 0,"|" => 0);

    // For every delimiter, we count the number of times it can be found within the CSV string
    foreach ($delimiters as $delimiter => &$count) {
        $count = substr_count($csv_string, $delimiter);
    }
    // Break the reference left over from the loop
    unset($count);

    // The delimiter used is probably the one that has the most occurrences in the file
    return array_search(max($delimiters), $delimiters);

}
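For files whose quoted fields contain line breaks, or that are too large to load in one go, an alternative is to read the file row by row with fgetcsv instead of explode and str_getcsv. The following is a sketch of that variant, not the function described in this article: the name import_csv_with_fgetcsv is hypothetical, the delimiter is passed in rather than detected, and rows whose column count differs from the headers are skipped.

```php
<?php
// Hypothetical variant based on fgetcsv (not the function built in this article)
function import_csv_with_fgetcsv(string $file, string $delimiter = ';', string $enclosure = '"'): array
{
    $handle = fopen($file, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open CSV file: $file");
    }

    // The first row holds the headers that we use as keys
    $head = fgetcsv($handle, 0, $delimiter, $enclosure, '\\');
    $array = array();

    if (is_array($head)) {
        while (($csv = fgetcsv($handle, 0, $delimiter, $enclosure, '\\')) !== false) {
            // A blank line comes back as array(null); skip it, along with mismatched rows
            if ($csv === array(null) || count($csv) !== count($head)) {
                continue;
            }
            $array[] = array_combine($head, $csv);
        }
    }

    fclose($handle);

    return $array;
}
```

Assuming a semicolon-delimited file, it would be called as import_csv_with_fgetcsv('contacts.csv'), where contacts.csv is a placeholder file name.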

The code is available as open source in the GitHub repo xusifob/lib.

Feel free to leave your comments below.
