October 15, 2019

Removing redundant divs and ids when publishing with org-mode

[2019-10-15 Tue 11:19]

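This blog is written in Org mode. When publishing, the HTML exporter automatically generates many <div> elements and div ids, which makes the output cluttered. The redundant <div>s and ids need to be removed.

Removing redundant <div>s when generating HTML during publish

The function to change is the one registered as (headline . org-html-headline) in the org-export-define-backend form in lisp/org/ox-html.el.

Use advice-add to override the original org-html-headline, so that the <div id="org???"></div> wrapper generated for each headline is dropped.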

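;; unpackaged/org-export-get-reference comes from unpackaged.el (linked
;; below) and must be loaded before this advice is used.
(advice-add #'org-export-get-reference :override #'unpackaged/org-export-get-reference)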

(advice-add #'org-html-headline :override #'albert|org-html-headline)

(defun albert|org-html-headline (headline contents info)
  "Transcode a HEADLINE element from Org to HTML.
CONTENTS holds the contents of the headline.  INFO is a plist
holding contextual information."
  (unless (org-element-property :footnote-section-p headline)
    (let* ((numberedp (org-export-numbered-headline-p headline info))
           (numbers (org-export-get-headline-number headline info))
           (level (+ (org-export-get-relative-level headline info)
                     (1- (plist-get info :html-toplevel-hlevel))))
           (todo (and (plist-get info :with-todo-keywords)
                      (let ((todo (org-element-property :todo-keyword headline)))
                        (and todo (org-export-data todo info)))))
           (todo-type (and todo (org-element-property :todo-type headline)))
           (priority (and (plist-get info :with-priority)
                          (org-element-property :priority headline)))
           (text (org-export-data (org-element-property :title headline) info))
           (tags (and (plist-get info :with-tags)
                      (org-export-get-tags headline info)))
           (full-text (funcall (plist-get info :html-format-headline-function)
                               todo todo-type priority text tags info))
           (contents (or contents ""))
           (ids (delq nil
                      (list (org-element-property :CUSTOM_ID headline)
                            (org-export-get-reference headline info)
                            (org-element-property :ID headline))))
           (preferred-id (car ids))
           (extra-ids
            (mapconcat
             (lambda (id)
               (org-html--anchor
                (if (org-uuidgen-p id) (concat "ID-" id) id)
                nil nil info))
             (cdr ids) "")))
      (if (org-export-low-level-p headline info)
          ;; This is a deep sub-tree: export it as a list item.
          (let* ((html-type (if numberedp "ol" "ul")))
            (concat
             (and (org-export-first-sibling-p headline info)
                  (apply #'format "<%s class=\"org-%s\">\n"
                         (make-list 2 html-type)))
             (org-html-format-list-item
              contents (if numberedp 'ordered 'unordered)
              nil info nil
              (concat (org-html--anchor preferred-id nil nil info)
                      extra-ids
                      full-text)) "\n"
             (and (org-export-last-sibling-p headline info)
                  (format "</%s>\n" html-type))))
        ;; Standard headline.  Unlike the stock function, emit only the
        ;; <hN> element: no outline-container <div> and no id attribute.
        (let ((first-content (car (org-element-contents headline))))
          (format "%s%s\n"
                  (format "\n<h%d>%s</h%d>\n"
                          level
                          (concat
                           (and numberedp
                                (format
                                 "<span class=\"section-number-%d\">%s</span> "
                                 level
                                 (mapconcat #'number-to-string numbers ".")))
                           full-text)
                          level)
                  (if (eq (org-element-type first-content) 'section) contents
                    (concat (org-html-section first-content "" info) contents))))))))

For a discussion of advice-add, see https://emacs-china.org/t/defadvice-advice-add/7355/19

For how to override a function with advice-add, see https://github.com/alphapapa/unpackaged.el#export-to-html-with-useful-anchors
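An :override advice replaces the stock function for the whole Emacs session, so it can be useful to check for it or switch it off again. A minimal sketch, assuming the function names used above:

```elisp
(require 'ox-html)

;; Non-nil if the override is currently installed.
(advice-member-p #'albert|org-html-headline 'org-html-headline)

;; Restore the stock headline exporter, e.g. before exporting a
;; document that should keep its outline-container <div>s.
(advice-remove 'org-html-headline #'albert|org-html-headline)
```

Calling advice-remove when the advice is not installed is harmless, so the second form is safe to evaluate at any time.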

Modifying org-html-section to remove the <div> block after each headline

(advice-add #'org-html-section :override #'albert|org-html-section)

(defun albert|org-html-section (section contents info)
  "Transcode a SECTION element from Org to HTML.
CONTENTS holds the contents of the section.  INFO is a plist
holding contextual information.

Unlike the stock `org-html-section', do not wrap CONTENTS in a
<div class=\"outline-text-N\"> container."
  (let ((parent (org-export-get-parent-headline section)))
    ;; Before first headline: no container, just return CONTENTS.
    (if (not parent) contents
      ;; After a headline: return CONTENTS without the wrapper <div>.
      (format "\n%s\n" (or contents "")))))
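An alternative to advising the stock functions globally is a derived export backend that swaps in only these two transcoders. The following is a sketch rather than what this blog actually uses: the backend name albert-html and the publishing function albert|org-html-publish-to-html are made up for illustration, and the publishing function is modeled on org-html-publish-to-html.

```elisp
(require 'ox-html)
(require 'ox-publish)

;; Hypothetical backend name `albert-html'; it inherits everything
;; from the stock `html' backend except the two transcoders below.
(org-export-define-derived-backend 'albert-html 'html
  :translate-alist '((headline . albert|org-html-headline)
                     (section . albert|org-html-section)))

;; Hypothetical publishing function, modeled on
;; `org-html-publish-to-html' but using the derived backend.
(defun albert|org-html-publish-to-html (plist filename pub-dir)
  "Publish FILENAME to HTML in PUB-DIR using the `albert-html' backend.
PLIST is the project property list."
  (org-publish-org-to 'albert-html filename
                      (concat "." (or (plist-get plist :html-extension)
                                      org-html-extension
                                      "html"))
                      plist pub-dir))
```

To use it, the project entry in org-publish-project-alist would set :publishing-function to albert|org-html-publish-to-html. One caveat: albert|org-html-headline calls org-html-section directly for headlines without a section, so without the advice that call would need to be changed to albert|org-html-section.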

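Source code blocks can be changed via `org-html-src-block`

Not modified.

Lists can be changed via `org-html-format-list-item`

Not modified.

Changing <div id="content">

I will look for where exactly this is generated when I have time.

Because the generated body is wrapped in <div id="content">, which uses an id rather than a class, main.css has to include a #content rule.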
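As far as I can tell, the element type and id of this wrapper come from the variable org-html-divs in ox-html.el (also settable per project through the :html-divs publishing property). It lets the wrapper be renamed but not stripped of its id. A sketch with illustrative values, not something this blog has adopted:

```elisp
(require 'ox-html)

;; Each entry is (SECTION ELEMENT-TYPE ID).  The values below are
;; illustrative: the body wrapper becomes <main id="main"> instead
;; of <div id="content">, so main.css would target #main (or main).
(setq org-html-divs
      '((preamble  "div"  "preamble")
        (content   "main" "main")
        (postamble "div"  "postamble")))
```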

Filtering the HTML after it is generated during publish

[2019-10-15 Tue 11:45]

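Another option is to filter the generated HTML and strip the redundant id attributes whose values contain the org keyword. The function below works well for that.

With this approach the redundant <div>s are still there, though, which is not pretty. The alternative is to avoid generating them in the first place.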

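;; `org-export-filter-final-output-functions' is defined in ox.el.
(require 'ox)

(defun html-body-id-filter (output backend info)
  "Remove random ID attributes generated by Org from OUTPUT.
Only applies when BACKEND is `html'.  INFO is unused."
  (when (eq backend 'html)
    (replace-regexp-in-string
     ;; Matches e.g. id="org1a2b3c4" and id="outline-container-org1a2b3c4".
     " id=\"[[:alpha:]-]*org[[:alnum:]]\\{7\\}\""
     ""
     output t)))

(add-to-list 'org-export-filter-final-output-functions #'html-body-id-filter)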
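To see what the filter's regular expression strips, it can be tried on a small fragment. The sample HTML below is hand-written for illustration and is not real exporter output; the variable names are made up as well.

```elisp
;; Made-up sample resembling exported HTML (an assumption, not real output).
(defvar albert|sample-html
  (concat "<div id=\"outline-container-org1a2b3c4\" class=\"outline-2\">\n"
          "<h2 id=\"org1a2b3c4\">Title</h2>\n"
          "<div id=\"content\">"))

;; Apply the same regexp that `html-body-id-filter' uses.
(defvar albert|stripped-html
  (replace-regexp-in-string
   " id=\"[[:alpha:]-]*org[[:alnum:]]\\{7\\}\""
   ""
   albert|sample-html t))

;; albert|stripped-html is now:
;; <div class="outline-2">
;; <h2>Title</h2>
;; <div id="content">
```

The generated ids are removed, while id="content" survives because its value does not contain org followed by seven alphanumeric characters.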

Reference: https://emacs.stackexchange.com/questions/36366/disable-auto-id-generation-in-org-mode-html-export

I also consulted the following URLs:

https://stackoverflow.com/questions/13340616/assign-ids-to-every-entry-in-org-mode

https://stackoverflow.com/questions/27132422/reference-unique-id-across-emacs-org-mode-files

https://github.com/alphapapa/unpackaged.el#export-to-html-with-useful-anchors (worth a look)

https://writequit.org/articles/emacs-org-mode-generate-ids.html

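Testing the static pages locally

The blog's CSS and images use absolute paths, so opening an HTML file by double-clicking it cannot resolve them and neither the CSS nor the images load. The static pages need to be served by an HTTP server.

Setting up nginx on Windows just to serve static pages is overkill. Python's http.server module is enough. Below is the Python 3 command. By default it listens on port 8000 on all interfaces; add --bind 127.0.0.1 to restrict it to localhost. Then open http://127.0.0.1:8000 in a browser.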

cd georgealbert.io
python -m http.server