Document Group Exports

PropertySync automatically generates full document group exports for all customers on a weekly basis. These exports are provided as a compressed sqlite file that follows our standard data structures, and they are available for download at any time from your SFTP account. Each export includes all documents, land validations, and auto completes within the document group. Orders and order searches can optionally be included on a case-by-case basis but require additional configuration; please reach out to support for a custom export.
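
Once you have downloaded an export from your SFTP account, decompress the .gz file before opening it. The following is a minimal Python sketch of that step; the filename export.sqlite.gz is a placeholder, since actual export filenames vary.

import gzip
import shutil

# Decompress the downloaded export (the filename is a placeholder)
with gzip.open("export.sqlite.gz", "rb") as src, open("export.sqlite", "wb") as dst:
    # Stream in chunks so large exports are not loaded into memory all at once
    shutil.copyfileobj(src, dst)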

The standard format of the export is a gzip-compressed sqlite database file, which can be opened using any sqlite-compatible database viewer once decompressed. The database contains the following tables:

  • documents
  • landvalidations
  • autocompletes

Each table has two fields: id and data. The data field contains the JSON structure of the document, land validation, or auto complete, as specified in the Object Data Structures section.
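
As a quick sanity check after decompressing, the following sketch opens the database with Python's built-in sqlite3 module and prints a few ids from each table; the filename is again a placeholder.

import sqlite3

conn = sqlite3.connect("export.sqlite")  # placeholder filename
for table in ("documents", "landvalidations", "autocompletes"):
    # Every table has the same two-column shape: an id and a JSON data blob
    rows = conn.execute(f"SELECT id FROM {table} LIMIT 3").fetchall()
    print(table, [row[0] for row in rows])
conn.close()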

We have also created a large library of SQL queries that can be run against the database to extract specific information or to generate reports. These queries can be run using any sqlite-compatible database viewer; we regularly use and recommend TablePlus for this purpose.

You can use the export to generate mass-edit batches that can be pushed back into the platform using our API, as sketched below. Alternatively, you can build customized reports against your plant data.
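
For illustration, the following sketch runs the "Create batches for all documents" query from the SQL Queries section below and posts each resulting batch to the API. The endpoint URL and authorization header are hypothetical placeholders; consult the API documentation for the actual batch import endpoint and credentials.

import sqlite3
import urllib.request

# The "Create batches for all documents" query from the SQL Queries section below
BATCH_QUERY = """
WITH RECURSIVE
batch_numbers AS (
  SELECT 0 as batch_num
  UNION ALL
  SELECT batch_num + 1
  FROM batch_numbers
  WHERE batch_num + 1 <= (SELECT (COUNT(*) - 1) / 100 FROM documents)
),
batch_rows AS (
  SELECT json(data) as data,
         (ROW_NUMBER() OVER () - 1) / 100 as batch_num
  FROM documents
)
SELECT json_object(
  'name', 'batch_' || (b.batch_num + 1),
  'documents', json_group_array(json(br.data))
) as batch
FROM batch_numbers b
JOIN batch_rows br ON br.batch_num = b.batch_num
GROUP BY b.batch_num
ORDER BY b.batch_num
"""

API_URL = "https://api.example.com/v1/batches"  # hypothetical endpoint
API_TOKEN = "YOUR_API_TOKEN"                    # hypothetical credential

conn = sqlite3.connect("export.sqlite")  # placeholder filename
for (batch,) in conn.execute(BATCH_QUERY):
    # Each row is a complete JSON payload: a batch name plus its documents
    request = urllib.request.Request(
        API_URL,
        data=batch.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        print(response.status)
conn.close()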

If you need help with the export or have any questions, please contact us at support@propertysync.com.

SQL Queries

The following is a list of a few SQL queries that can be run against the document group export database to extract specific information or to generate reports. If you need assistance, or have a specific use case, please contact our support team for help.

Add a tag to all documents

-- add a tag
WITH RECURSIVE 
  vars(tag_name) AS (
    VALUES('LOCATE')  -- Replace this value with whatever tag you want to add
  )
UPDATE documents 
SET data = json_patch(
  data,
  json_object(
    'json', json_object(
      'tags', json(
        CASE 
          WHEN NOT EXISTS (
            SELECT 1 
            FROM json_each(json_extract(data, '$.json.tags')) 
            WHERE value = (SELECT tag_name FROM vars)
          )
          THEN (
            SELECT json_group_array(value)
            FROM (
              SELECT value
              FROM json_each(json_extract(data, '$.json.tags'))
              UNION ALL
              SELECT tag_name FROM vars
            )
          )
          ELSE json_extract(data, '$.json.tags')
        END
      )
    )
  )
)
WHERE json_valid(data);
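
Note that this UPDATE modifies only your local copy of the export database. To apply the change in PropertySync itself, push the edited documents back into the platform through the API, as described above.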

Count the number of documents with a specific tag

WITH RECURSIVE 
  vars(tag_name) AS (
    VALUES('LOCATE')  -- Replace with the tag you want to search for
  )
SELECT 
  COUNT(*) as document_count,
  json_group_array(json_extract(data, '$.id')) as matching_document_ids
FROM documents 
WHERE json_valid(data)
AND EXISTS (
  SELECT 1 
  FROM json_each(json_extract(data, '$.json.tags'))
  WHERE value = (SELECT tag_name FROM vars)
);

Count all documents

SELECT COUNT(id) FROM documents;

Create batches for all documents

-- create batches for all documents with 100 documents per batch

WITH RECURSIVE
batch_numbers AS (
  SELECT 0 as batch_num
  UNION ALL
  SELECT batch_num + 1 
  FROM batch_numbers
  WHERE batch_num + 1 <= (SELECT (COUNT(*) - 1) / 100 FROM documents)
),
batch_rows AS (
  SELECT 
    ROW_NUMBER() OVER () as row_num,
    json(data) as data,  -- Parse the data as JSON instead of string
    (ROW_NUMBER() OVER () - 1) / 100 as batch_num
  FROM documents
)
SELECT json_object(
  'name', 'batch_' || (b.batch_num + 1),
  'documents', json_group_array(json(br.data))  -- Ensure nested JSON stays as JSON
) as batch
FROM batch_numbers b
JOIN batch_rows br ON br.batch_num = b.batch_num
GROUP BY b.batch_num
ORDER BY b.batch_num;

Find additions with no land validation

-- find all addition entries in the autocompletes list that do NOT have a matching entry in the landvalidations table
WITH authorized_additions AS (
  SELECT DISTINCT 
    json_extract(data, '$.id') as id,
    json_extract(data, '$.value') as addition_name
  FROM autocompletes
  WHERE json_valid(data)
  AND json_extract(data, '$.type') = 'addition'
)
SELECT 
  aa.id as autocomplete_id,
  aa.addition_name
FROM authorized_additions aa
WHERE NOT EXISTS (
  SELECT 1 
  FROM landvalidations lv
  WHERE json_valid(lv.data)
  AND json_extract(lv.data, '$.json.addition') = aa.addition_name
)
ORDER BY aa.addition_name;

Find documents with a specific addition

WITH RECURSIVE 
  vars(addition_name) AS (
    VALUES('CON-ARK I-40 COMMERCIAL SUB (PLATS F 31 & F 58)')  -- Replace with the addition name you want to find
  )
SELECT 
  json_extract(data, '$.id') as document_id,
  json_extract(data, '$.json.filedDate') as filed_date,
  json_extract(data, '$.json.subdivisionLegal') as subdivisions
FROM documents 
WHERE json_valid(data)
AND EXISTS (
  SELECT 1 
  FROM json_each(json_extract(data, '$.json.subdivisionLegal'))
  WHERE json_extract(value, '$.addition') = (SELECT addition_name FROM vars)
);

Find documents older than a certain date

WITH RECURSIVE 
  vars(cutoff_date) AS (
    VALUES('2003-03-01')  -- Replace with your cutoff date in YYYY-MM-DD format
  )
SELECT 
  json_extract(data, '$.id') as document_id,
  json_extract(data, '$.json.filedDate') as filed_date,
  json_extract(data, '$.json.instrumentType') as instrument_type,
  json_extract(data, '$.json.bookType') as book_type,
  json_extract(data, '$.json.tags') as tags
FROM documents 
WHERE json_valid(data)
AND json_extract(data, '$.json.filedDate') < (SELECT cutoff_date FROM vars)
AND json_extract(data, '$.json.filedDate') IS NOT NULL
ORDER BY json_extract(data, '$.json.filedDate') ASC;

Find documents with an invalid addition

-- this query will find all docs that have an addition in a subdivisionLegal that is NOT contained in the additions field of autocompletes

WITH authorized_additions AS (
  SELECT DISTINCT json_extract(data, '$.value') as addition_name
  FROM autocompletes
  WHERE json_valid(data)
  AND json_extract(data, '$.type') = 'addition'
)
SELECT DISTINCT
  json_extract(d.data, '$.id') as document_id,
  json_extract(d.data, '$.json.filedDate') as filed_date,
  json_extract(value, '$.addition') as unauthorized_addition
FROM documents d
CROSS JOIN json_each(json_extract(d.data, '$.json.subdivisionLegal')) as subdivisions
WHERE json_valid(d.data)
AND json_extract(value, '$.addition') IS NOT NULL
AND json_extract(value, '$.addition') NOT IN (
  SELECT addition_name 
  FROM authorized_additions
)
ORDER BY filed_date DESC;

Find potentially duplicated documents

-- this query will find all rows that share the same document id, listing their filed dates and instrument numbers
WITH duplicates AS (
  SELECT 
    json_extract(data, '$.id') as doc_id,
    COUNT(*) as occurrence_count,
    GROUP_CONCAT(rowid) as rowids
  FROM documents 
  WHERE json_valid(data)
  AND json_extract(data, '$.id') IS NOT NULL
  GROUP BY json_extract(data, '$.id')
  HAVING COUNT(*) > 1
)
SELECT 
  d.doc_id,
  d.occurrence_count,
  d.rowids,
  GROUP_CONCAT(
    json_extract(docs.data, '$.json.filedDate') || ' (rowid: ' || docs.rowid || ')'
  ) as filed_dates,
  GROUP_CONCAT(
    COALESCE(json_extract(docs.data, '$.json.instrumentNumber'), 'NULL') || ' (rowid: ' || docs.rowid || ')'
  ) as instrument_numbers
FROM duplicates d
JOIN documents docs ON docs.rowid IN (
  SELECT value 
  FROM json_each('["' || REPLACE(d.rowids, ',', '","') || '"]')
)
GROUP BY d.doc_id
ORDER BY d.occurrence_count DESC;

Python Scripts

The following is a list of a few Python scripts that can be used to interact with the document group export database. These scripts require Python 3 and use only the built-in sqlite3 module. If you need assistance, or have a specific use case, please contact our support team for help.

Export all documents to individual JSON files

import sqlite3
import json
import sys

def export_documents_to_json(db_path, output_path):
    """
    Export documents from the SQLite database as individual .json files.

    Args:
        db_path (str): Path to the SQLite database
        output_path (str): Directory where the individual .json files will be written
    """
    # Connect to the database
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    
    # Query to get all documents
    query = """
    SELECT json_extract(data, '$')
    FROM documents
    WHERE json_valid(data)
    """
    
    try:
        cursor.execute(query)

        # Process each row
        for row in cursor:

            # Write the JSON to a file named after the document id
            if row[0] is not None and row[0] != '[]':
                data = json.loads(row[0])
                document_id = data['id']
                filepath = f"{output_path}/{document_id}.json"
                print(f"Writing document {document_id} to {filepath}")
                with open(filepath, 'w', encoding='utf-8') as f:
                    json.dump(data, f, indent=2, ensure_ascii=False)
                
    finally:
        cursor.close()
        conn.close()



if __name__ == "__main__":
    # get db_path from first argument
    db_path = sys.argv[1]

    # get output_path from second argument
    output_path = sys.argv[2]

    export_documents_to_json(db_path, output_path)
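
To run the script, pass the path to the decompressed database and an output directory as arguments, for example python3 export_documents.py export.sqlite ./output (the script and file names here are placeholders). Note that the output directory must already exist; the script does not create it.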
