
Commit 7ce3658

Merge pull request #291 from cyyeh/feature/update-doc
update docs
2 parents bc00c09 + c0ca9b2 commit 7ce3658

6 files changed

+106
-5
lines changed

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
---
date: 2023-08-30
authors:
  name: Jimmy Yeh
  title: core member of VulcanSQL
  url: https://github.com/cyyeh
  image_url: https://avatars.githubusercontent.com/u/11023068?v=4
  email: jimmy.yeh@cannerdata.com
---

# Query Data from the Internet and Deliver APIs in no time

*TLDR: VulcanSQL, a free and open-source data API framework built specifically for data applications,
empowers data professionals to generate and distribute data APIs quickly and effortlessly.
It takes your SQL templates and transforms them into data APIs, with no backend expertise necessary.*

## One Way to Understand APIs

As API designers, we can think of an API as composed of three components: **input**, **transformation**, and **output**.

Let's start with the input component. Here we need to consider what the API's data sources are. Typically, data sources
are databases, files on an FTP server, and so on. Once we decide which data sources our APIs support, we also
need to support the mechanisms for getting data out of each of them.

Next, the transformation component is generally where we handle business logic. Finally, the output component covers where
the API's results go and how they are delivered, for example through RESTful APIs or GraphQL.

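To make the three components concrete, here is a minimal Python sketch. It is not VulcanSQL code: the sample data, the `sales` table, and the function names (`read_input`, `transform`, `render_output`, `handle_request`) are all hypothetical, and an in-memory SQLite database stands in for whichever database or warehouse a real API would query.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data standing in for a real data source.
SAMPLE_CSV = """city,sales
Taipei,120
Tokyo,300
Taipei,80
"""


def read_input(csv_text):
    """Input: parse CSV text into a list of (city, sales) rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["city"], int(row["sales"])) for row in reader]


def transform(rows):
    """Transformation: express the business logic as a SQL query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (city TEXT, sales INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    cursor = conn.execute(
        "SELECT city, SUM(sales) AS total FROM sales GROUP BY city ORDER BY city"
    )
    result = [{"city": city, "total": total} for city, total in cursor]
    conn.close()
    return result


def render_output(records):
    """Output: serialize the result as JSON, as a REST endpoint might."""
    return json.dumps(records)


def handle_request(csv_text):
    """Chain the three components: input -> transformation -> output."""
    return render_output(transform(read_input(csv_text)))


print(handle_request(SAMPLE_CSV))
```

Keeping the three steps as separate functions mirrors the split described above: the data source or the delivery format can change without touching the SQL in the middle.
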
Now, let's take VulcanSQL as a quick example. In the transformation part, you write SQL templates,
and for the output part it currently supports RESTful APIs. As for the input part, read on to get the full story!

![vulcansql](./static/query-data-from-the-internet-and-deliver-apis-in-no-time/vulcansql.png)

<!--truncate-->

## The Input Part: Data Sources

VulcanSQL aims to help data professionals create and deliver data APIs easily!
Originally, VulcanSQL supported data warehouses and databases such as [BigQuery](../docs/connectors/bigquery),
[Snowflake](../docs/connectors/snowflake), [ClickHouse](../docs/connectors/clickhouse), and [PostgreSQL](../docs/connectors/postgresql). However, as we shared VulcanSQL with the world,
we realized that a lot of data on the Internet is not stored in databases we can connect to directly,
such as CSV files or data that lives in other people's databases!

That's what we're going to share with you next: how VulcanSQL can help you get data from the Internet!

## How Can VulcanSQL Help?

As of now, VulcanSQL provides two mechanisms to help you get data from the Internet.

### DuckDB and its httpfs extension

In VulcanSQL, we can use [DuckDB](../docs/connectors/duckdb) as a caching layer to [enhance query performance](./powering-rapid-data-apps-with-vulcansql), or as a data connector.
For those who are not familiar with DuckDB, it is a high-performance in-process OLAP database
with lots of extensions available! To get data from the Internet, VulcanSQL supports the httpfs extension!

With the [httpfs extension](https://duckdb.org/docs/extensions/httpfs.html), VulcanSQL can now query CSV, JSON, and Parquet files from the Internet!

Imagine you find an interesting dataset on the Internet, and it's a CSV file! You can query it directly
with a SQL statement like the following, transform the data using SQL, and deliver APIs right away to share with others!

```sql
SELECT
  *
FROM 'https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2021-financial-year-provisional/Download-data/annual-enterprise-survey-2021-financial-year-provisional-csv.csv'
```

The image below shows the data details using VulcanSQL's [API Catalog feature](../docs/catalog/intro):
![vulcansql-httpfs](./static/query-data-from-the-internet-and-deliver-apis-in-no-time/vulcansql-httpfs.png)

If you would like to read the source code of the full example, please [check it out here](https://github.com/Canner/vulcan-sql-examples/tree/main/read-data-from-internet)!

### The API Extension

Sometimes, you may find interesting data served by RESTful APIs that others have created. VulcanSQL now has [the API extension](../docs/extensions/api), which
allows you to query data from third parties through RESTful APIs!

In the following example, we call the RESTful API at `https://dummyjson.com` and search its products with a query string!

```sql
{% set a_variable_you_can_define = { "query": { "q": "phone" } } %}
SELECT {{ a_variable_you_can_define | rest_api(url='https://dummyjson.com/products/search') }}
```

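To illustrate what a query-string search like this amounts to at the HTTP level, the Python sketch below builds a request URL from the same `query` object. It reflects an assumption about the resulting request, not VulcanSQL's implementation of `rest_api`, and `build_request_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_request_url(base_url, options):
    """Append options["query"] to base_url as a URL-encoded query string."""
    query = options.get("query", {})
    if not query:
        return base_url
    return f"{base_url}?{urlencode(query)}"


# Same shape as the variable defined with the `set` tag above.
options = {"query": {"q": "phone"}}
url = build_request_url("https://dummyjson.com/products/search", options)
print(url)  # https://dummyjson.com/products/search?q=phone
```
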
In addition to the GET method, the API extension also supports other HTTP methods!

If you would like to read the source code of the full example, please [check it out here](https://github.com/Canner/vulcan-sql-examples/tree/main/restapi-caller).

## Conclusion

We hope this blog post demonstrates how VulcanSQL can help you query data from the Internet, so that you can create and deliver APIs in no time!

packages/doc/blog/querying-your-data-easily-and-smartly-through-huggingface.mdx

Lines changed: 3 additions & 3 deletions
@@ -63,9 +63,9 @@ VulcanSQL currently integrates the table question answering feature by creating
 `huggingface_table_question_answering` and allows you to apply functions to variables using the
 pipe operator (`|`).

-**Sample 1 - send the data from the variable [`set` tag](https://www.notion.so/VulcanSQL-edb87d04de074125ab19275e6f63d844?pvs=21):**
+**Sample 1 - send the data from the [`set` tag](../docs/develop/advanced#set-variables):**

-You could give the dataset with the **[`set` tag](https://www.notion.so/VulcanSQL-edb87d04de074125ab19275e6f63d844?pvs=21)**
+You could give the dataset with the **`set` tag**
 and give the question with the `query` field:

 ```sql
@@ -104,7 +104,7 @@ Here is a response returned by `huggingface_table_question_answering`:
 The result will be converted to a JSON string from `huggingface_table_question_answering`.
 You could decompress the JSON string and use the result by yourself.

-**Sample 2 - send the data from the `req` tag:**
+**Sample 2 - send the data from the [`req` tag](../docs/develop/predefined-queries):**

 You could also use the `req` tag to keep the query result from the previous SQL condition and save it
 to a variable named `repositories`. Then you can use `.value()` to get the data result and

packages/doc/docs/develop/cache.mdx

Lines changed: 12 additions & 0 deletions
@@ -63,6 +63,18 @@ cache:

 In this configuration, the `cache_departments` table will be utilized within the `{% cache %}` tag.

+You can also configure a refresh interval in the `cache` section of the YAML file using the `refreshTime` keyword.
+
+```yaml
+cache:
+  - cacheTableName: 'cache_departments' # The name of the table in the cache layer storage
+    ...
+    refreshTime: { every: '5m' }
+```
+
+:::info
+The time format used in `refreshTime` should be compliant with the [`ms`](https://www.npmjs.com/package/ms) package.
+:::

 ## Reusing Cached Results
 VulcanSQL provides the ability to keep the query result from the cache layer in a variable, which can be reused in subsequent queries. For example:
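
As an aside on the `refreshTime` value added in this file: `ms`-style strings such as `'5m'` denote durations. The Python sketch below converts a small subset of that format to milliseconds so an interval is easy to sanity-check. It is a simplified illustration, not the `ms` package itself (which accepts more units and long-form names), and `parse_duration_ms` is a hypothetical helper.

```python
import re

# Milliseconds per unit for the short unit names handled by this sketch.
_UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}

_DURATION_RE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*(ms|s|m|h|d)\s*")


def parse_duration_ms(text):
    """Convert a short duration string such as '5m' to milliseconds."""
    match = _DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(float(value) * _UNIT_MS[unit])


print(parse_duration_ms("5m"))  # 300000
```
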

packages/doc/docs/extensions/huggingface/huggingface-table-question-answering.mdx

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@ The [Table Question Answering](https://huggingface.co/docs/api-inference/detaile

 The result will be converted to a JSON string from `huggingface_table_question_answering`. You could decompress the JSON string and use the result by itself.

-**Sample 1 - send the data from variable by [set tag](https://vulcansql.com/docs/develop/advance#set-variables):**
+**Sample 1 - send the data from variable by [set tag](../../develop/advanced#set-variables):**

 ```sql
 {% set data = [
@@ -41,7 +41,7 @@ SELECT {{ data | huggingface_table_question_answering(query="How many repositori
 ]
 ```

-**Sample 2 - send the data from [req tag](https://vulcansql.com/docs/develop/predefined-queries):**
+**Sample 2 - send the data from [req tag](../../develop/predefined-queries):**

 ```sql
 {% req artists %}
