【Question title】: Parsing pdftk dump_data_fields using PHP?
【Posted】: 2016-04-24 02:43:40
【Question description】:

I need some advice on the best way to parse the output of pdftk dump_data_fields using PHP.

The properties I need to extract are FieldName and FieldNameAlt, and optionally FieldMaxLength and the FieldStateOption values. A sample of the output:

FieldType: Text
FieldName: TestName1
FieldNameAlt: TestName1
FieldFlags: 29360128
FieldJustification: Left
FieldMaxLength: 5
---
FieldType: Button
FieldName: TestName3
FieldFlags: 0
FieldJustification: Left
FieldStateOption: Off
FieldStateOption: Yes
---
...

【Comments】:

    Tags: php parsing text-parsing pdftk


    【Solution 1】:

    Would something like this be enough?

    $handle = fopen("/tmp/bla.txt", "r");
    if ($handle) {
        $output = array();
        while (($line = fgets($handle)) !== false) {
            if (trim($line) === "---") {
                // Block completed; process it
                if (sizeof($output) > 0) {
                    print_r($output);
                }
                $output = array();
                continue;
            }
            // Process contents of data block; the limit of 2 splits only at
            // the first colon, so values that contain a colon stay intact
            $parts = explode(":", $line, 2);
            if (sizeof($parts) === 2) {
                $key = trim($parts[0]);
                $value = trim($parts[1]);
                if (isset($output[$key])) {
                    $i = 1;
                    while(isset($output[$key.$i])) $i++;
                    $output[$key.$i] = $value;
                }
                else {
                    $output[$key] = $value;
                }
            }
            else {
                // handle malformed input
            }
        }
    
        // process final block
        if (sizeof($output) > 0) {
            print_r($output);
        }
        fclose($handle);
    }
    else {
        // error while opening the file
    }
    

    This gives you the following output:

    Array
    (
        [FieldType] => Text
        [FieldName] => TestName1
        [FieldNameAlt] => TestName1
        [FieldFlags] => 29360128
        [FieldJustification] => Left
        [FieldMaxLength] => 5
    )
    Array
    (
        [FieldType] => Button
        [FieldName] => TestName3
        [FieldFlags] => 0
        [FieldJustification] => Left
        [FieldStateOption] => Off
        [FieldStateOption1] => Yes
    )
    

    Reading a value from a block is then straightforward:

    echo $output["FieldName"];
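
    The question asks only for FieldName, FieldNameAlt and, optionally, FieldMaxLength and the FieldStateOption values, so a block produced by the loop above could be reduced to just those attributes. The following is a minimal sketch and not part of the original answer: the helper name summarizeField and the shape of the returned array are assumptions made for illustration. It relies on the numbered keys (FieldStateOption, FieldStateOption1, ...) that the loop above generates for repeated lines.

```php
<?php
/**
 * Reduce one parsed block to the attributes named in the question.
 * Hypothetical helper: the name and the return shape are illustrative.
 */
function summarizeField(array $block): array
{
    $options = array();
    foreach ($block as $key => $value) {
        // Matches FieldStateOption, FieldStateOption1, FieldStateOption2, ...
        if (preg_match('/^FieldStateOption\d*$/', (string) $key) === 1) {
            $options[] = $value;
        }
    }

    return array(
        'name'      => isset($block['FieldName']) ? $block['FieldName'] : null,
        'nameAlt'   => isset($block['FieldNameAlt']) ? $block['FieldNameAlt'] : null,
        // Missing for non-text fields, so null means "no limit reported".
        'maxLength' => isset($block['FieldMaxLength']) ? (int) $block['FieldMaxLength'] : null,
        'options'   => $options,
    );
}

// Usage with the second block from the output above.
$summary = summarizeField(array(
    'FieldType'          => 'Button',
    'FieldName'          => 'TestName3',
    'FieldFlags'         => '0',
    'FieldJustification' => 'Left',
    'FieldStateOption'   => 'Off',
    'FieldStateOption1'  => 'Yes',
));
print_r($summary);
```

    Attributes missing from a block come back as null (or as an empty options list), which keeps the optional ones easy to test for.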
    

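    【Comments】:

    • How can I buy you a coffee? :D
    【Solution 2】:

    I modified the code above a little and fixed a few problems, such as the last field not making it into the array. The updated code looks like this.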

    // Get the form data fields. $pdf is not defined in this snippet; it is
    // assumed to be an object whose getDataFields() returns the raw
    // dump_data_fields text.
    $fieldsDataStr = $pdf->getDataFields();

    /* Split the string into an array of lines. */
    $lines = explode("\n", $fieldsDataStr);
    /* Append '---' to the lines because the logic below only stores a field
       when it reaches a separator; this captures the last field too. */
    array_push($lines, "---");

    $output = array();
    $pdfDataArray = array();
    $counterField = 0;
    foreach ($lines as $line) {
        if (trim($line) === "---") {
            // Block completed; process it
            if (sizeof($output) > 0) {
                $pdfDataArray[] = $output;
                $counterField = $counterField + 1; // fields counter
            }
            $output = array();
            continue;
        }
        // Process contents of data block
        // The limit of 2 splits only at the first colon
        $parts = explode(":", $line, 2);
        if (sizeof($parts) === 2) {
            $key = trim($parts[0]);
            $value = trim($parts[1]);
            $output[$key] = $value;
        }
    }
    
        print_r($pdfDataArray);
    

    It returns the correct array.
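
    One caveat: the assignment `$output[$key] = $value` keeps only the last value when a key repeats within a block, so a button with several FieldStateOption lines ends up with a single option. A minimal sketch of one way to keep them all follows; it is not from the original answer. The function name parseDataFields and the choice to store every value as a list are assumptions for illustration.

```php
<?php
/**
 * Parse dump_data_fields text into a list of fields.
 * Hypothetical helper: every key maps to a list of values, so repeated
 * lines such as FieldStateOption are all kept.
 */
function parseDataFields(string $text): array
{
    $fields = array();
    $current = array();

    // Split on any line ending.
    foreach (preg_split('/\R/', $text) as $line) {
        if (trim($line) === '---') {
            // Separator: store the finished block, if any.
            if (count($current) > 0) {
                $fields[] = $current;
            }
            $current = array();
            continue;
        }

        // The limit of 2 keeps colons that appear inside the value.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $current[trim($parts[0])][] = trim($parts[1]);
        }
    }

    // Store the last block even when no trailing separator follows it.
    if (count($current) > 0) {
        $fields[] = $current;
    }

    return $fields;
}

// Usage with a shortened version of the sample from the question.
$sample = "FieldType: Text\nFieldName: TestName1\nFieldMaxLength: 5\n---\n"
        . "FieldType: Button\nFieldName: TestName3\n"
        . "FieldStateOption: Off\nFieldStateOption: Yes\n---\n";

$fields = parseDataFields($sample);
echo $fields[1]['FieldName'][0], "\n";
print_r($fields[1]['FieldStateOption']);
```

    Storing every value as a list makes single-valued attributes slightly more verbose to read (`[0]`), but the array shape stays the same no matter how often a key repeats.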

    【Comments】:
