ORC-1112: Add Using with Python web page (#1039)
### What changes were proposed in this pull request?

This PR adds a `Using with Python` web page to the Apache ORC website for Python users in the community.

### Why are the changes needed?

To help Python users use the `Apache Arrow` project with the latest `Apache ORC 1.7.x C++` release.

### How was this patch tested?

Built the docs and checked the generated website. The embedded code can be tested with `PyArrow 6.0.1 (latest)` and will be improved in `PyArrow 7.0` via [ARROW-15338: [Python] Add pyarrow.orc.read_table API](apache/arrow@ff4b9be)

<img width="581" alt="Screen Shot 2022-02-01 at 2 28 15 PM" src="https://user-images.githubusercontent.com/9700541/152062188-d9d3309a-9367-49dc-b8ea-0f4bac8d9919.png">

<img width="100%" alt="Screen Shot 2022-02-01 at 2 29 23 PM" src="https://user-images.githubusercontent.com/9700541/152062356-934b366f-040b-4fa7-8beb-27ae786e028b.png">

This closes #1027
dongjoon-hyun authored Feb 2, 2022
1 parent bbcc1d0 commit 50ba8bb
Showing 3 changed files with 42 additions and 0 deletions.
4 changes: 4 additions & 0 deletions site/_data/docs.yml
@@ -11,6 +11,10 @@
  - building
  - releases

- title: Using in Python
  docs:
  - pyarrow

- title: Using in Spark
  docs:
  - spark-ddl
37 changes: 37 additions & 0 deletions site/_docs/pyarrow.md
@@ -0,0 +1,37 @@
---
layout: docs
title: PyArrow
permalink: /docs/pyarrow.html
---

## How to install

Apache Arrow project's PyArrow is the recommended package.

https://pypi.org/project/pyarrow/

```
pip3 install pyarrow
pip3 install pandas
```

## How to write and read an ORC file

```
In [1]: import pandas as pd
In [2]: import pyarrow as pa
In [3]: import pyarrow.orc as orc
In [4]: orc.write_table(pa.table({"col1": [1, 2, 3]}), "test.orc")
In [5]: t = orc.ORCFile("test.orc").read()
In [6]: t.to_pandas()
Out[6]:
   col1
0     1
1     2
2     3
```
1 change: 1 addition & 0 deletions site/index.html
@@ -44,6 +44,7 @@ <h2>Complex Types</h2>
<div class="unit golden-large code">
<p class="title">Quickstart Documentation</p>
<ul class="shell">
<li><a href="docs/pyarrow.html">Using with Python</a></li>
<li><a href="docs/spark-ddl.html">Using with Spark</a></li>
<li><a href="docs/hive-ddl.html">Using with Hive</a></li>
<li><a href="docs/mapred.html">Using with Hadoop MapRed</a></li>