

Comments on How can I do a one-time data import to DokuWiki to create many pages?

Parent

How can I do a one-time data import to DokuWiki to create many pages?

+3
−0

I am new to DokuWiki and have done the basic setup for a private wiki. I want to use it as an alternative to a spreadsheet I've been using to track my ratings and other notes for a category of products. (The spreadsheet hasn't been workable for me on my phone.) I am creating pages, one per specific product.[1]

This spreadsheet has hundreds of rows, and I'd like to script the creation of all those pages. DokuWiki page source is just text files, so I tried creating a page manually in the data/pages/ directory, but it didn't show up. Presumably I need to also edit some metadata, but this is where I'm having trouble finding my way around.

I found the CSV plugin but it creates one big table -- not what I want. I also found the Struct plugin, but it seems more complicated and more rigid than I want -- yes I have a spreadsheet now, which is inherently structured, but as I add to this wiki I want to be free to adjust individual pages. For example, sometimes I have more than one rating, recorded on different dates, and I want those to be grouped on one page. In some cases I'll want to add external links. So I'm looking for an initial structured import, but I want the resulting pages to be plain old wiki text, freely editable. Ideally, I would like to find the simplest approach that works for this one-time data-import problem; the closer to bare-bones DokuWiki I stay, the better.

I'm only asking about the DokuWiki side of this, not how to write a script to pull values out of the spreadsheet. Assume I already have blocks of Markdown suitable as source for wiki pages; I'm trying to add them. (Added this paragraph in response to a comment.)


  1. It's my beer ratings, so it's important to be able to easily look up "have I had this before? what did I think of it?" from a restaurant. I'd been using sites like RateBeer and BeerAdvocate, but I need more flexibility in my note-taking so I'd rather keep my own data than depend on a third-party service. ↩︎


Post
+2
−0

While writing this answer, I realized that Selenium can do this without any DokuWiki plugin that supports your specific use case. Selenium is a browser automation tool: a Python script can open Chrome (or another browser), go to a URL, find a link, click it, fill out forms, click a button, and so on.

You would write a Python script like:

beers = load_spreadsheet_into_list_of_dicts()
start_up_selenium()

for b in beers:
    page_text = generate_dokuwiki_text(b)
    submit_with_selenium(page_text)

start_up_selenium is some boilerplate for initializing Selenium and Chrome. If you don't enable headless mode you'll even see it open a Chrome window when you run the script.
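As a minimal sketch of that boilerplate, assuming Selenium 4 for Python is installed and Chrome is available on the machine, start_up_selenium might look something like this. The `headless` parameter, the `chrome_arguments` helper, and the window size are illustrative choices, not part of the original outline.

```python
def chrome_arguments(headless: bool) -> list[str]:
    """Command-line switches passed to Chrome (illustrative choices)."""
    args = ["--window-size=1280,1024"]
    if headless:
        # Recent Chrome versions accept --headless=new; without it a window opens.
        args.append("--headless=new")
    return args


def start_up_selenium(headless: bool = False):
    """Start Chrome under Selenium's control and return the driver object."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in chrome_arguments(headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

In this version the function returns the driver, so the loop would pass it along to whatever does the submitting, and would call `driver.quit()` at the end.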

generate_dokuwiki_text will be a simple function that takes a dictionary (the column values for one beer) and fills in a text template written in DokuWiki syntax.
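A rough sketch of such a template function, using only the standard library. The column names (`name`, `brewery`, `rating`, `date`, `notes`) and the page layout are hypothetical; they would need to match the real spreadsheet headers and whatever layout you want.

```python
from string import Template

# Hypothetical column names; replace them with the spreadsheet's real headers.
PAGE_TEMPLATE = Template(
    "====== $name ======\n"
    "\n"
    "  * **Brewery:** $brewery\n"
    "  * **Rating:** $rating\n"
    "  * **Date:** $date\n"
    "\n"
    "===== Notes =====\n"
    "\n"
    "$notes\n"
)


def generate_dokuwiki_text(beer: dict) -> str:
    """Fill the DokuWiki-syntax template from one spreadsheet row."""
    # safe_substitute leaves a placeholder visible instead of raising
    # KeyError when a column is missing from the row.
    return PAGE_TEMPLATE.safe_substitute(beer)
```

The result is ordinary wiki text, so each page stays freely editable after the import.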

submit_with_selenium would have the actual Selenium code that goes to the URL where DokuWiki lets you create the page (its "Create this page" action), pastes the text into the text box, and clicks the save button.
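A hedged sketch of that step follows. It assumes a default DokuWiki install reachable through `doku.php`, that the edit box has the id `wiki__text` and the save button the id `edbtn__save` (check both with "Inspect" on your own wiki), and that the browser session is already allowed to edit (logged in, or the wiki is open). The extra `driver`, `base_url`, and `page_id` parameters and the `edit_url` helper are additions for illustration.

```python
from urllib.parse import urlencode


def edit_url(base_url: str, page_id: str) -> str:
    """Build the URL that opens DokuWiki's editor for page_id."""
    return f"{base_url}/doku.php?" + urlencode({"id": page_id, "do": "edit"})


def submit_with_selenium(driver, base_url: str, page_id: str, page_text: str) -> None:
    """Open the editor for page_id, fill in page_text, and save."""
    # Imported here so edit_url can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    driver.get(edit_url(base_url, page_id))
    # Assumed element ids; confirm them in the page's HTML before relying on them.
    box = driver.find_element(By.ID, "wiki__text")
    # Set the textarea value directly, which is closer to pasting than typing
    # and avoids any editor key handlers reacting to typed newlines.
    driver.execute_script("arguments[0].value = arguments[1];", box, page_text)
    driver.find_element(By.ID, "edbtn__save").click()
```

Trying it on one or two rows first, with the browser window visible, is an easy way to confirm the ids and the page naming before running it over hundreds of rows.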

I'm purposely leaving the details vague so they can become separate posts on software.codidact.com, in case the Selenium docs and a web search aren't enough.

This is a little easier if you're self-hosting, or if you know the admin of the DokuWiki instance. Sometimes admins get mad when you run bots to create pages :)

Selenium is a weird way to accomplish this, but once you learn how to do this kind of thing with Selenium, there are a lot of other cool things you can apply that knowledge to. So arguably it's a more useful thing to learn than a specialized import plugin for one specific wiki engine.


1 comment thread

I wouldn't have thought of Selenium (which I've heard of but not used), so thanks for that. (2 comments)
Monica Cellio‭ wrote 11 months ago · edited 11 months ago

I wouldn't have thought of Selenium (which I've heard of but not used), so thanks for that. (And yes, I'm self-hosting, so I don't need to worry about angry admins.)

matthewsnyder‭ wrote 11 months ago · edited 11 months ago

Good luck! Once you get past its boilerplate, it has a pretty intuitive API (compared to others like Beautiful Soup, which are themselves considered "easy"). Glancing at the docs, I think it has also gotten a little easier since I last used it.

When locating elements, I find it best to use the id if one is set in the HTML, and a CSS selector otherwise.
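That preference could be captured in a tiny helper, sketched here. The `locator` function is hypothetical; the two strategy strings are the values behind Selenium's `By.ID` and `By.CSS_SELECTOR` constants, so the helper itself needs no import.

```python
def locator(element_id=None, css=None):
    """Return a (strategy, value) pair, preferring the id when there is one."""
    if element_id:
        return ("id", element_id)  # same string as Selenium's By.ID
    if css:
        return ("css selector", css)  # same string as Selenium's By.CSS_SELECTOR
    raise ValueError("need an element id or a CSS selector")
```

It would be used as `driver.find_element(*locator(element_id="some-id"))`, where `some-id` stands for whatever id "Inspect" shows for the element.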

In most browsers, you can right-click the element and choose "Inspect" to see its HTML, including its attributes. Firefox lets you right-click the HTML in the Inspector and choose "Copy / CSS Selector", although the selector it generates is often overly specific. That won't matter in your case, since you'll only run the script once and don't need to future-proof it against site redesigns.

In the Firefox Inspector, you can press Ctrl+F and type a CSS selector into the search box. That's a good way of testing selectors to see what they match.