SEO: rewrite certain xml:id values and validate against Geekodoc #364
139325d to fc28462
Phew, this will break a lot of external links, right?
Did we ever say that our links live forever? All the other SUSE documentation had to take this poison pill, so why not ours?
fc28462 to c012ad1
No, but that doesn't mean we shouldn't give a 💩 either. obs-landing knows redirects. At least for the sections etc. (everything but anchors), we should do those. Like
@hennevogel OK, I'll create a companion PR at obs-landing.
86e8c96 to 55dbf51
Otherwise we break an unknown number of external hyperlinks. Refers-to: openSUSE/obs-docu#364 Signed-off-by: Nathan Cutler <[email protected]>
bb27e05 to 03f4eff
That's true, we faced this problem in the past. We mitigated it by adding redirect rules in our Apache configuration. We still face this problem from time to time when some clever managers want to change the product names. 😉 If IDs are part of URLs, there is not much you can do to avoid this problem. Sometimes it's inevitable. You can make people aware of it to mitigate the issue. For the record, our style guide contains a section about identifiers, just to give you an idea of how we use and create IDs.
As already suggested by @hennevogel, I have opened a companion PR with corresponding redirects: openSUSE/obs-landing#466
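To illustrate how the old-to-new mapping behind such redirects could be derived, here is a minimal sketch. It assumes the only change to an ID is the replacement of dots and underscores with dashes. The function name `build_redirect_map` and the sample IDs are hypothetical, and the redirect format that obs-landing actually expects is not covered here.

```python
def build_redirect_map(old_ids):
    """Map each old xml:id to its dash-only form, skipping unchanged IDs."""
    redirects = {}
    for old in old_ids:
        new = old.replace(".", "-").replace("_", "-")
        if new != old:
            redirects[old] = new
    return redirects


if __name__ == "__main__":
    # Hypothetical sample IDs, for illustration only
    sample = ["cha.obs.example_one", "sec-already-fine"]
    for old, new in build_redirect_map(sample).items():
        print(f"{old} -> {new}")
```

IDs that were renamed by hand (rather than by character replacement) would have to be added to such a map manually.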
Yes, saw it. Just wanted to give you some context from our side. 🙂
We seem to have reached a consensus that migrating to Geekodoc is the way to go, so as long as there are no objections to the additional changes I have added, this PR seems ready to go in (together with its counterpart PR over at obs-landing).
For SEO optimization, we need to avoid the use of dots ('.') and underscores ('_') in our xml:id values. Signed-off-by: Nathan Cutler <[email protected]>
Signed-off-by: Nathan Cutler <[email protected]>
The xml:id values in the XML file were rewritten, but not in the correct way. Signed-off-by: Nathan Cutler <[email protected]>
The term "open build service" (without initial caps) is wrong, anyway. Signed-off-by: Nathan Cutler <[email protected]>
It doesn't make sense to hyperlink to a figure that follows immediately after the link. Signed-off-by: Nathan Cutler <[email protected]>
This comment wasn't adding any value. Signed-off-by: Nathan Cutler <[email protected]>
This FIXME was confused by the presence of "&gitproject;" in the referencespec. It doesn't actually link to GitHub. Signed-off-by: Nathan Cutler <[email protected]>
I don't know what the original intention was here, but these days it's just confusing. Signed-off-by: Nathan Cutler <[email protected]>
The art-obs-quickstart tag has been here since the beginning, but it doesn't mean anything in the given context and hence is just decreasing the overall signal-to-noise ratio. Signed-off-by: Nathan Cutler <[email protected]>
There was some confusion over whether "Source Services" should be plural or singular, and in some places we were referring to Source Services merely as "services" in contexts where it might not be abundantly clear to the reader what is meant by "services". Signed-off-by: Nathan Cutler <[email protected]>
As a first-time reader I was confused by the current text, so I quickly came up with some edits to make the chapter more intelligible. Signed-off-by: Nathan Cutler <[email protected]>
03f4eff to 8c0a735
7754e7b to 6fc1afa
I did a quick validation check and there is more work ahead. 😉 I forgot to mention that the

@smithfarm Nathan, I've experimented a bit to make this work more efficient. After I was unsuccessful with

# modifydocbook.py
import re
import sys

# Matches xml:id="..." and linkend="..." attributes
pattern = re.compile(r'(xml:id|linkend)="([^"]+)"')


def _fix_id(match):
    """Return the attribute with dots and underscores replaced by dashes."""
    attr_value = match.group(2).replace('.', '-').replace('_', '-')
    return f'{match.group(1)}="{attr_value}"'


def modify_docbook_xml(file_path):
    """
    Read a DocBook XML file and replace dots and underscores with dashes
    in xml:id and linkend attributes.
    Prints the modified content to standard output.

    Args:
        file_path: Path to the original DocBook XML file.
    """
    with open(file_path, 'r') as fh:
        for line in fh:
            # A callable replacement rewrites each attribute on the line
            # with its own value, even if a line holds several of them.
            print(pattern.sub(_fix_id, line), end="")


if __name__ == "__main__":
    modify_docbook_xml(sys.argv[1])

If you install

$ for xml in xml/*.xml; do
    python3 modifydocbook.py "$xml" | sponge "$xml"
  done

After this process, there is one ID (
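As a possible alternative to piping through `sponge`, the same replacement could be done in place from Python. The following is only a sketch under that assumption, not the script used in this PR; the names `rewrite_ids` and `rewrite_file` are hypothetical, and the files are assumed to be UTF-8 encoded.

```python
import re
import sys
from pathlib import Path

# Same attributes as in the script above: xml:id="..." and linkend="..."
ID_ATTR = re.compile(r'(xml:id|linkend)="([^"]+)"')


def rewrite_ids(text):
    """Replace dots and underscores with dashes in xml:id/linkend values."""
    def fix(match):
        value = match.group(2).replace(".", "-").replace("_", "-")
        return f'{match.group(1)}="{value}"'
    return ID_ATTR.sub(fix, text)


def rewrite_file(path):
    """Rewrite one file in place; return True if its content changed."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    updated = rewrite_ids(original)
    if updated != original:
        path.write_text(updated, encoding="utf-8")
    return updated != original


if __name__ == "__main__":
    # Usage (hypothetical file name): python3 rewrite_in_place.py xml/*.xml
    for name in sys.argv[1:]:
        if rewrite_file(name):
            print(f"modified: {name}")
```

Printing only the modified file names gives a quick count of how many files the rewrite touched.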
@tomschr Thanks! But I believe I did that already... Do you see any linkend attribute value that I missed?
daps thinks otherwise. 😉 It reports some xml:id and linkend errors. I count 14 files that are modified after I apply my script. 🙂
@tomschr Instead of me duplicating your work, could you just push the changes right into the PR? Or just push a branch with the commit and I'll cherry-pick it in...
@tomschr The odd thing is, Geekodoc no longer complains about the I mean, I'm trying to get to them all, but I found I missed some - yet Geekodoc (daps) did not complain.
Where? Here in the PR it says this:
I thought this meant that daps thinks the XML is valid.
Use the option
If that's the case, you are still validating against DocBook 5. However, DocBook doesn't have any restrictions regarding IDs. To really validate against Geekodoc, you need to tell daps which schema to use. In your
DOCBOOK5_RNG_URI="urn:x-suse:rng:v2:geekodoc-flat"
Then validate again and you should hopefully see some messages. For me, it looks like this:
GeekoDoc restricts identifiers in xml:id/linkend attributes and allows only specific characters (no dots, underscores).
* Replace dots/underscores with "-"
* Correct ROOTID in DC files
* Set validation schema in DC files to GeekoDoc
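For a quick pre-check before running daps, a small script could list the attribute values that still contain the two characters named above. This is a hedged sketch only: `find_offending_ids` is a hypothetical helper, it looks for dots and underscores and nothing else, and it does not replace validation against the GeekoDoc schema, which may enforce further rules.

```python
import re
import sys

ID_ATTR = re.compile(r'(xml:id|linkend)="([^"]+)"')


def find_offending_ids(text):
    """Return (line number, attribute, value) for IDs containing '.' or '_'."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for attr, value in ID_ATTR.findall(line):
            if "." in value or "_" in value:
                hits.append((lineno, attr, value))
    return hits


if __name__ == "__main__":
    # Usage (hypothetical file name): python3 check_ids.py xml/*.xml
    for name in sys.argv[1:]:
        with open(name, encoding="utf-8") as fh:
            for lineno, attr, value in find_offending_ids(fh.read()):
                print(f"{name}:{lineno}: {attr}={value!r}")
```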
@smithfarm Absolutely! I just didn't want to interfere with your work, so my next question would have been exactly that. 👍 I've committed the changes in commit 05a8958, see the commit message for details.
@tomschr Thanks! Much appreciated.
c687115 to 05a8958
@@ -3,15 +3,15 @@
 <!ENTITY % entities SYSTEM "entity-decl.ent">
 %entities;
 ]>
-<chapter version="5.1" xml:id="cha-obs-source-service"
+<chapter version="5.1" xml:id="cha-obs-source-services"
I fear we are going to need redirects for this too...
@@ -4,25 +4,36 @@
 <!ENTITY % entities SYSTEM "entity-decl.ent">
 %entities;
 ]>
-<chapter version="5.1" xml:id="cha-obs-building"
+<chapter version="5.1" xml:id="cha-obs-local-building"
Please add a redirect to obs-landing
For SEO optimization, we need to avoid the use of dots ('.') and underscores ('_') in our xml:id values.
To be merged together with companion PR: openSUSE/obs-landing#466