Welcome to the Power Users community on Codidact!

Power Users is a Q&A site for questions about the usage of computer software and hardware. We are still a small site and would like to grow, so please consider joining our community. We are looking forward to your questions and answers; they are the building blocks of a repository of knowledge we are building together.

Comments on How can I do a one-time data import to DokuWiki to create many pages?

Post

How can I do a one-time data import to DokuWiki to create many pages?

+3
−0

I am new to DokuWiki and have done the basic setup for a private wiki. I want to use it as an alternative to a spreadsheet I've been using to track my ratings and other notes for a category of products. (The spreadsheet hasn't been workable for me on my phone.) I am creating pages, one per specific product.[1]

This spreadsheet has hundreds of rows, and I'd like to script the creation of all those pages. DokuWiki page source is just text files, so I tried creating a page manually in the data/pages/ directory, but it didn't show up. Presumably I need to also edit some metadata, but this is where I'm having trouble finding my way around.

I found the CSV plugin but it creates one big table -- not what I want. I also found the Struct plugin, but it seems more complicated and more rigid than I want -- yes I have a spreadsheet now, which is inherently structured, but as I add to this wiki I want to be free to adjust individual pages. For example, sometimes I have more than one rating, recorded on different dates, and I want those to be grouped on one page. In some cases I'll want to add external links. So I'm looking for an initial structured import, but I want the resulting pages to be plain old wiki text, freely editable. Ideally, I would like to find the simplest approach that works for this one-time data-import problem; the closer to bare-bones DokuWiki I stay, the better.

I'm only asking about the DokuWiki side of this, not how to write a script to pull values out of the spreadsheet. Assume I already have blocks of Markdown suitable as source for wiki pages; I'm trying to add them. (Added this paragraph in response to a comment.)


  1. It's my beer ratings, so it's important to be able to easily look up "have I had this before? what did I think of it?" from a restaurant. I'd been using sites like RateBeer and BeerAdvocate, but I need more flexibility in my note-taking so I'd rather keep my own data than depend on a third-party service.


1 comment thread

Split up? (5 comments)
matthewsnyder wrote 9 months ago

I think there are several things being asked here, or at least any solution will almost certainly have to decompose the problem:

  1. Iterate over the records in a spreadsheet, one at a time. You probably want to export to something simple like CSV or JSON if the original spreadsheet is in a complicated format like Excel.
  2. For each record, write a small script (I feel like you have to do some scripting here) to take the values in each column and compose them into a text file. You could use simple string concatenation in Python, or get fancy with a template engine like Jinja.
  3. You'll now end up with a lot of text files, one per record. Looking at https://www.dokuwiki.org/devel:dirlayout, DW keeps a lot of metadata besides just the pages, so this is probably why just putting the files under data/pages didn't work. You need to use DW's own features to import the files, and those probably expect something like Markdown. You can convert Markdown to other formats with pandoc. (...)
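Steps 1 and 2 can be sketched roughly as follows, using only the Python standard library. The column names (`name`, `rating`, `date`, `notes`), the Markdown layout, and the helper names are all hypothetical stand-ins for whatever the real spreadsheet contains. The file-naming helper assumes DokuWiki's convention of lowercase page file names ending in `.txt`. This only produces the text files; it does not by itself answer how to get them into the wiki.

```python
import csv
import io
import re
from pathlib import Path


def page_filename(name: str) -> str:
    """Hypothetical helper: turn a product name into a lowercase .txt file name."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return f"{slug}.txt"


def render_page(row: dict) -> str:
    """Compose one page body from one record with plain string formatting.

    The column names used here are assumptions, not taken from the question.
    """
    return (
        f"# {row['name']}\n\n"
        f"* Rating: {row['rating']}\n"
        f"* Date: {row['date']}\n\n"
        f"{row['notes']}\n"
    )


def write_pages(csv_text: str, out_dir: Path) -> list:
    """Write one text file per CSV record and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        path = out_dir / page_filename(row["name"])
        path.write_text(render_page(row), encoding="utf-8")
        written.append(path)
    return written
```

Calling `write_pages(Path("export.csv").read_text(encoding="utf-8"), Path("out"))` would fill `out/` with one file per row, where `export.csv` is a placeholder for the exported spreadsheet. Note that two rows with the same name would overwrite each other here, which is the case step 4 deals with.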
matthewsnyder wrote 9 months ago

(...) 4. Lastly, it sounds like there are some cases where multiple records are actually the same beer with different dates etc. and need to be combined into a single page. This is trickier. If there aren't too many such cases, it would be easier to do it manually. If there are a lot, I think you would probably have to convert the data into nested JSON, so that each element in the list of records can itself be a list of (related) records. The easiest way of doing that IMO is to iterate through the CSV and build a dictionary, which you then export to JSON.
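The grouping in step 4 might look something like this minimal sketch. It assumes the records share a `name` column that identifies the beer (a hypothetical column name), and it treats rows with exactly the same name as the same beer; any fuzzier matching would need extra logic.

```python
import csv
import io
import json
from collections import defaultdict


def group_records(csv_text: str, key: str = "name") -> dict:
    """Build a dictionary mapping each key value to the list of its records."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row[key]].append(row)
    return dict(grouped)


def to_json(grouped: dict) -> str:
    """Serialize the grouped records as nested JSON."""
    return json.dumps(grouped, indent=2, ensure_ascii=False)
```

Each value in the resulting dictionary is a list, so a beer rated once gets a one-element list and a beer rated several times gets all of its ratings together, ready to be rendered onto a single page.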

IMO these 4 should be separate questions. I know how to do 1, 2 and 4. I don't know 3 and don't have time right now to set up a test DW, so I can't post a full answer. 😢 But if they were separate, "how to import some markdowns into DW" seems like someone would know it (I even found https://www.dokuwiki.org/plugin:docimporter).

matthewsnyder wrote 9 months ago · edited 9 months ago

Also, if DW is not cooperating on imports, you have two more "nuclear options":

  1. Export a backup of DW. Change it to contain the pages you want. "Restore" the backup. This assumes you won't hit the metadata problem again.
  2. When submitting a DW page, have the browser's inspector open and capture the HTTP request. It's probably some POST with a username/token/cookie in the headers and the page content as the request body. You can then use Python requests, curl, HTTPie etc. in a script to submit each of your pages to the DokuWiki URL (I assume this is self-hosted). It's a bit overkill, since a wiki should really provide UX for importing easily, but not as hard as it sounds.
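As a rough illustration of option 2, here is a sketch that replays a captured form submission using only the Python standard library (instead of the third-party `requests` package). Everything specific in it is an assumption to be checked against the request you actually capture: the form field names (`id` and `wikitext` as defaults), the extra fields such as a security token, the cookie value, and the URL. The function name is hypothetical too.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_save_request(url, page_id, text, cookie, extra_fields,
                       id_field="id", text_field="wikitext"):
    """Build (but do not send) a POST request that mimics the captured form.

    id_field and text_field are assumed names; copy the real ones, along with
    any token fields for extra_fields, from the request seen in the inspector.
    """
    fields = dict(extra_fields)
    fields[id_field] = page_id
    fields[text_field] = text
    body = urlencode(fields).encode("ascii")
    return Request(url, data=body, headers={"Cookie": cookie}, method="POST")
```

Sending is then a matter of passing the result to `urllib.request.urlopen` once per page, ideally against a test copy of the wiki first. Tokens captured from one session may expire, so a real script might need to fetch a fresh one before each save.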

I suppose you could also use Selenium to literally go to the DW page, paste in the text from each row of the spreadsheet, and click "save page" for you, and even watch Selenium do it in a Chrome window.

Monica Cellio wrote 9 months ago

I'll edit to clarify. I'm not asking for help on the scripting part; in fact, I've already transformed that spreadsheet into suitable blocks of markdown source. I'm just asking about the DokuWiki side of actually getting those pages (with the needed metadata) into the wiki.

matthewsnyder wrote 9 months ago

Ah, got it. So it really comes down to figuring out a way to bulk import into DW at this point.

I tried searching a bit when I saw this question, and I was very surprised that the DW docs don't have a whole section explaining exactly how to import Markdown. They do discuss importing from other wikis, but I'm not sure if that helps. If a built-in import isn't going to be easier than just doing the Selenium approach, it's kind of pointless :)